In a Hadoop cluster, three types of nodes
Master nodes control which nodes perform which tasks and what processes run on what nodes. The majority of work is assigned to worker nodes. Worker node store most of the data and perform most of the calculations Edge nodes facilitate communications from end users to master and worker nodes.
Some nodes have important tasks, which may impact performance if interrupted. Edge nodes allow end users to contact worker nodes when necessary, providing a network interface for the cluster without leaving the entire cluster open to communication. That limitation improves reliability and security. As work is evenly distributed between work nodes, the edge node’s role helps avoid data skewing and performance issues.
Despite the three specified types of nodes in a cluster, the roles of each node type are not hard-coded. Administrators are able to assign service roles to any node type. However, some vendors may specify the distinction between nodes to avoid the risk of creating performance and balance issues.