Want to learn more? Take the full course at [ Link ] at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.
---
Hi! My name is Eric, and I am a Data Scientist working at the intersection of biological network science and infectious disease, and I'm thrilled to share with you my knowledge on how to do network analytics. I hope we'll have a fun time together!
Let me first ask you a question: what are some examples of networks? Well, one example might be a social network! In a social network, we are modelling the relationships between people.
Here’s another one - transportation networks. In a transportation network, we are modelling the connectivity between locations, as determined by roads or flight paths connecting them.
At their core, networks are a useful tool for modelling relationships between entities.
By modelling your data as a network, you can gain insight into which entities (or nodes) are important, such as broadcasters or influencers in a social network. Additionally, you can start to think about optimizing transportation between cities. Finally, you can leverage the network structure to find communities in the network.
Let’s get a bit more technical. Networks are described by two sets of items: nodes and edges. Together, these form a “network”, otherwise known in mathematical terms as a “graph”.
Nodes and edges can have metadata associated with them. For example, let’s say there are two friends, Hugo and myself, who met on the 21st of May, 2016. In this case, the nodes are “Hugo” and myself, each with metadata stored as key-value pairs, such as “id” and “age”. The friendship is represented as a line (an edge) between the two nodes, and may have metadata such as “date”, which represents the date on which we first met.
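As a rough sketch of how that friendship could be stored, here is one way to write it with NetworkX. The node name "Eric" and the id and age values are placeholders invented for illustration; only the date comes from the example above.

```python
from datetime import date

import networkx as nx

# Build a small friendship graph with metadata on nodes and on the edge.
friends = nx.Graph()

# The id and age values below are hypothetical placeholders, not real data.
friends.add_node("Hugo", id=1, age=30)
friends.add_node("Eric", id=2, age=30)

# The edge carries the date on which the two friends first met.
friends.add_edge("Hugo", "Eric", date=date(2016, 5, 21))

# Edge metadata can be read back by indexing with both endpoints.
first_met = friends["Hugo"]["Eric"]["date"]
print(first_met)
```

Keyword arguments passed to add_node() and add_edge() end up in a plain dictionary attached to that node or edge, which is why they can be read back like dictionary keys.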
In the Python world, there is a library called NetworkX that allows us to manipulate, analyze and model graph data. Let’s see how we can use the NetworkX API to analyze graph data in memory.
NetworkX is typically imported as nx. Using nx.Graph(), we can initialize an empty graph to which we can add nodes and edges. I can add the integers 1, 2, and 3 as nodes using the add_nodes_from() method, passing in the list [1, 2, 3] as an argument. The Graph object G has a .nodes() method that shows which nodes are present in the graph; it returns a list of nodes in NetworkX 1.x, and a list-like view of the nodes in NetworkX 2.0 and later.
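A minimal sketch of those steps, assuming NetworkX is installed. Wrapping the result in list() is an addition here so that the output looks the same whether .nodes() returns a list or a view.

```python
import networkx as nx

# Start from an empty undirected graph.
G = nx.Graph()

# Add the integers 1, 2 and 3 as nodes in one call.
G.add_nodes_from([1, 2, 3])

# list() gives a plain list regardless of the NetworkX version.
node_list = list(G.nodes())
print(node_list)  # [1, 2, 3]
```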
If we add an edge between the nodes 1 and 2, we can then use the G.edges() method to see the edges as tuples, in which each tuple shows the two nodes that are present on that edge. As with nodes, this is a list in NetworkX 1.x and a list-like view in NetworkX 2.0 and later.
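Here is a short sketch of adding that edge and reading it back. The graph is rebuilt from scratch so the snippet runs on its own, and list() is used again only to get a plain list.

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from([1, 2, 3])

# Connect node 1 and node 2 with a single undirected edge.
G.add_edge(1, 2)

# Each edge is reported as a tuple of its two endpoints.
edge_list = list(G.edges())
print(edge_list)  # [(1, 2)]
```

Node 3 stays in the graph with no edges attached, so it does not show up in the edge list.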
Metadata can be stored on the graph as well. For example, I can add to the node 1 a ‘label’ key with the value ‘blue’, just as I would assign a value to the key of a dictionary. I can then retrieve the nodes with the metadata attached using G.nodes(), passing in the data=True argument. What this returns is a sequence of 2-tuples, in which the first element of each tuple is the node, and the second element is a dictionary in which the key-value pairs correspond to my metadata.
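The following sketch shows that dictionary-style assignment. It uses the G.nodes[1] spelling from NetworkX 2.0 and later; in NetworkX 1.x the equivalent was G.node[1], so adjust if you are on an older version.

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from([1, 2, 3])

# Assign metadata to node 1 as if writing to a dictionary.
# (NetworkX 2.0+ syntax; NetworkX 1.x used G.node[1] instead.)
G.nodes[1]["label"] = "blue"

# data=True pairs every node with its metadata dictionary.
nodes_with_data = list(G.nodes(data=True))
print(nodes_with_data)  # [(1, {'label': 'blue'}), (2, {}), (3, {})]
```

Nodes without any metadata still appear, paired with an empty dictionary.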
NetworkX also provides basic drawing functionality, using the nx.draw() function. nx.draw() takes in a graph G as an argument. In the IPython shell, you will also have to call the plt.show() function (from matplotlib.pyplot, conventionally imported as plt) in order to display the graph on screen. With this graph, the nx.draw() function will draw what we call a node-link diagram rendering of the graph.
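A small sketch of the drawing step, assuming matplotlib is installed alongside NetworkX. Creating the axes explicitly and passing with_labels=True are choices made here for readability; calling nx.draw(G) on its own also works.

```python
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
G.add_nodes_from([1, 2, 3])
G.add_edge(1, 2)

# Draw the node-link diagram onto an explicit set of axes.
fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True)

# Needed in a plain script or IPython shell to show the figure.
plt.show()
```

The node positions come from a layout algorithm that is randomly initialized by default, so the picture may look different from run to run.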
The first set of exercises we’ll be doing here is essentially exploratory data analysis on graphs. Alright, let’s go on and take a look at the exercises!
#DataCamp #PythonTutorial #NetworkAnalysisinPython