NetworkX is a Python package for creating, manipulating, and analyzing complex networks, also called graphs. To understand what NetworkX does, you first need to understand graphs. Graphs are mathematical structures used to model many types of relationships and processes in physical, biological, social, and information systems. A graph consists of nodes, or vertices, which represent the entities in the system, connected by edges, which represent the relationships between those entities. Working with graphs means traversing nodes and edges to discover and understand complex relationships or to find optimal paths between linked data in a network.
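As a minimal sketch of nodes, edges, and path finding, the following builds a small graph and asks NetworkX for the route with the fewest hops between two nodes. The city names and connections are invented for illustration and are not taken from any real data set.

```python
import networkx as nx

# Build a small undirected graph. The cities and links are hypothetical.
G = nx.Graph()
G.add_edge("Austin", "Dallas")
G.add_edge("Dallas", "Chicago")
G.add_edge("Chicago", "Boston")
G.add_edge("Boston", "Denver")
G.add_edge("Denver", "Austin")

# Nodes are created automatically when an edge refers to them.
num_nodes = G.number_of_nodes()
num_edges = G.number_of_edges()

# Traverse the edges to find the path with the fewest hops.
path = nx.shortest_path(G, source="Austin", target="Chicago")
hops = nx.shortest_path_length(G, source="Austin", target="Chicago")

print(num_nodes, num_edges)
print(path, hops)
```

Here the five cities form a ring, so the route through Dallas (two hops) is shorter than the route through Denver and Boston (three hops).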
There are many uses of graph network analysis, such as analyzing relationships in social networks, cyber threat detection, and identifying the people most likely to buy a product based on shared preferences.
In the real world, nodes can be people, groups, places, or things such as customers, products, members, cities, stores, airports, ports, bank accounts, devices, mobile phones, molecules, or web pages.
Examples of edges, or relationships between nodes, include friendships, network connections, hyperlinks, roads, routes, wires, phone calls, emails, “likes,” payments, transactions, and social networking messages. Edges can be directed, representing a one-way relationship from one node to another, as when Janet “likes” a social media post of Jeanette’s. They can also be undirected: if Bob is a Facebook friend of Alice, then Alice is also a friend of Bob.
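The difference between one-way and two-way relationships can be sketched with NetworkX's `DiGraph` and `Graph` classes. The variable names below are illustrative only.

```python
import networkx as nx

# Directed graph: Janet liked Jeanette's post, but not the other way round.
likes = nx.DiGraph()
likes.add_edge("Janet", "Jeanette")
janet_likes_jeanette = likes.has_edge("Janet", "Jeanette")
jeanette_likes_janet = likes.has_edge("Jeanette", "Janet")

# Undirected graph: a friendship holds in both directions.
friends = nx.Graph()
friends.add_edge("Bob", "Alice")
bob_to_alice = friends.has_edge("Bob", "Alice")
alice_to_bob = friends.has_edge("Alice", "Bob")

print(janet_likes_jeanette, jeanette_likes_janet)
print(bob_to_alice, alice_to_bob)
```

In the directed graph the edge exists only in the direction it was added, while the undirected graph reports the same edge from either endpoint.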
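NetworkX nodes can be any hashable object, meaning an object whose hash value does not change during its lifetime and that can be compared for equality with other objects. Examples include text strings, images, XML objects, other graphs, and custom node objects; mutable containers such as lists cannot be used as nodes. The base package includes many functions to generate, read, and write graphs in multiple formats.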
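A short sketch of both points: which Python objects are accepted as nodes, and how a graph can be generated, written to disk, and read back. The edge-list format and the temporary file name are choices made for this example; NetworkX supports other formats as well.

```python
import os
import tempfile

import networkx as nx

G = nx.Graph()
G.add_node("alice")            # a text string
G.add_node(42)                 # an integer
G.add_node(("row", 3))         # a tuple of hashable items
G.add_node(frozenset({1, 2}))  # an immutable set

# A list is mutable and therefore unhashable, so it is rejected as a node.
try:
    G.add_node(["not", "hashable"])
    list_rejected = False
except TypeError:
    list_rejected = True

# Generate a path graph 0-1-2-3, write it as an edge list, and read it back.
P = nx.path_graph(4)
with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "path.edgelist")  # hypothetical file name
    nx.write_edgelist(P, filename, data=False)
    P2 = nx.read_edgelist(filename, nodetype=int)

print(G.number_of_nodes(), list_rejected)
print(sorted(P2.nodes()), P2.number_of_edges())
```

Passing `nodetype=int` converts the node labels back to integers on reading; without it they would be read as strings.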
NetworkX can operate on very large graphs, with more than 10 million nodes and 100 million edges. The core package, which is free software under the 3-clause BSD license, includes data structures for representing simple graphs, directed graphs, and graphs with parallel edges and self-loops. NetworkX also has a large community of developers who maintain the core package and contribute to a third-party ecosystem.
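The following sketch contrasts a simple graph with a multigraph to show what parallel edges and self-loops mean in practice. The node labels and the `kind` attribute are made up for illustration.

```python
import networkx as nx

# A MultiGraph keeps every edge, even between the same pair of nodes.
M = nx.MultiGraph()
M.add_edge("A", "B", kind="email")
M.add_edge("A", "B", kind="phone call")    # parallel edge
M.add_edge("A", "A", kind="note to self")  # self-loop
parallel_edges = M.number_of_edges("A", "B")
self_loops = nx.number_of_selfloops(M)

# A plain Graph stores at most one edge per node pair, so adding the
# same edge again only updates its attributes.
S = nx.Graph()
S.add_edge("A", "B", kind="email")
S.add_edge("A", "B", kind="phone call")
simple_edges = S.number_of_edges()
latest_kind = S["A"]["B"]["kind"]

print(parallel_edges, self_loops)
print(simple_edges, latest_kind)
```

For directed graphs with parallel edges, NetworkX provides the analogous `MultiDiGraph` class.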
Among the principal uses of NetworkX are:
- Study of the structure and dynamics of social, biological, and infrastructure networks
- Standardized programming environment for graphs
- Rapid development of collaborative, multidisciplinary projects
- Integration with algorithms and code written in C, C++, and FORTRAN
- Working with large nonstandard data sets
NetworkX is generally considered easy to install and use, particularly for developers already familiar with Python.