Big data networks must be built to handle distributed resources that are simultaneously working on a single task. That’ll require network resiliency, consistency and application awareness.
When we think about big data, we associate it with the term big, but when it comes to building infrastructure, we should also be thinking distributed.
Big data applications do, in fact, deal with large volumes of information that are made even bigger as data is replicated across racks for resiliency. Yet the most meaningful attribute of big data is not its size, but its ability to break larger jobs into lots of smaller ones, distributing resources to work in parallel on a single task.
When you combine large data volumes with a distributed architecture, a special set of requirements emerges for big data networks. Here are six to consider:
1. Network resiliency and big data applications
When you have a set of distributed resources that must coordinate through an interconnection, availability is crucial. If the network is unavailable, the result is a discontiguous collection of stranded compute resources and data sets.
Appropriately, the primary focus for most network architects and engineers is uptime. But the sources of downtime in networks are varied. They include everything from device failures (both hardware and software) to maintenance windows, to human error. Downtime is unavoidable. While it is important to build a highly available network, designing for perfect availability is impossible.
Rather than making downtime avoidance the objective, network architects should design networks that are resilient to failures. Resilience in networks is determined by path diversity (having more than one way to get between resources) and failover (being able to identify issues quickly and fail over to other paths). The real design criteria for big data networks ought to explicitly include these characteristics alongside more traditional mean time between failures, or MTBF, methods.
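The interplay of path diversity and failover described above can be sketched in a few lines. This is a hypothetical illustration (the topology, path names, and health-check flags are invented for the example), not a model of any specific product's failover behavior:

```python
# Hypothetical sketch: keep an ordered list of diverse paths between two
# resources and fail over to the next healthy one when the active path fails.

# Two physically diverse paths between a compute rack and a storage rack.
paths = [
    {"name": "spine-1", "healthy": True},
    {"name": "spine-2", "healthy": True},
]

def active_path(paths):
    """Return the name of the first healthy path, or None if all are down."""
    for path in paths:
        if path["healthy"]:
            return path["name"]
    return None

print(active_path(paths))        # primary path carries traffic
paths[0]["healthy"] = False      # simulate a device failure on the primary
print(active_path(paths))        # traffic fails over to the diverse path
```

The point of the sketch is the design criterion itself: without the second entry in `paths` (diversity), there is nothing to fail over to, no matter how quickly the failure is detected.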
E. Unicast Routing Protocols
There are two kinds of routing protocols available to route unicast packets:
Distance Vector Routing Protocol
Distance Vector is a simple routing protocol that bases its routing decisions on the number of hops between source and destination. A route with fewer hops is considered the best route. Every router advertises its best routes to other routers, and ultimately all routers build up their view of the network topology from the advertisements of their peer routers.
The Routing Information Protocol (RIP) is an example.
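The distance-vector exchange can be sketched as follows. The four-router chain topology is invented for illustration, and the metric is a plain hop count (as in RIP); real implementations add timers, split horizon, and a hop limit that this sketch omits:

```python
# Minimal distance-vector sketch over a hypothetical chain topology A-B-C-D.
# Each router repeatedly merges its neighbors' advertised routes (adding one
# hop to reach the neighbor) until no routing table changes.

INF = float("inf")

# neighbors[router] = list of directly connected routers (1 hop each)
neighbors = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

# Each router starts knowing only itself, at a cost of 0 hops.
tables = {router: {router: 0} for router in neighbors}

changed = True
while changed:
    changed = False
    for router, peers in neighbors.items():
        for peer in peers:
            # Merge the peer's advertisement: its cost plus 1 hop to reach it.
            for dest, cost in tables[peer].items():
                if cost + 1 < tables[router].get(dest, INF):
                    tables[router][dest] = cost + 1
                    changed = True

print(tables["A"])  # A's converged hop counts: {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```

Note that A learns about D only indirectly, through B's and C's advertisements, which is exactly why distance-vector convergence takes several exchange rounds.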
Link State Routing Protocol
Link State is a slightly more complicated protocol than Distance Vector. It takes into account the states of the links of all the routers in a network. This technique lets every router build a common graph of the entire network, from which each router then calculates its own best paths for routing. Open Shortest Path First (OSPF) and Intermediate System to Intermediate System (IS-IS) are examples.
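The "common graph, independent calculation" idea can be sketched with a shortest-path computation. The topology and link costs below are invented for illustration; the algorithm is Dijkstra's, which is what OSPF and IS-IS use over the flooded link-state database:

```python
import heapq

# Minimal link-state sketch over a hypothetical topology. After flooding,
# every router holds the same weighted graph and independently computes its
# cheapest path to every destination (Dijkstra's algorithm).

# graph[router] = {neighbor: link cost}
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2, "D": 4},
    "C": {"A": 5, "B": 2, "D": 1},
    "D": {"B": 4, "C": 1},
}

def shortest_paths(source):
    """Return the cheapest total cost from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

print(shortest_paths("A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

Here A reaches C for a cost of 3 via B (1 + 2) rather than 5 over the direct link, something a pure hop-count metric would get wrong; link costs, not hop counts, are what distinguish the link-state approach in this sketch.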