How internet communication works: Network Coding

In the late 1950s and 1960s, people began to experiment with the idea of computer-to-computer communication. All information, no matter the form, is stored in the computer’s memory as a sequence of bits. To send these bits, they created hardware that could transmit a clocked sequence of electrical pulses along a wire. On the receiving end, the computer would capture, store, and convert the bits into the information they represent.

Doing this with two computers along a single line is quite simple. But of course, other people wanted to connect their computers together too, leading to a computer network. One simple way to build this network would be to connect every machine to every other machine, but as you add more and more computers this requires an impractical amount of line, and it’s incredibly wasteful.

Telephone networks at the time took a more practical approach: individuals were connected to a hub (such as a city switchboard), a point through which many people could connect. At this hub, an operator with a switchboard would manually patch people together by closing a circuit between them (this is known as a circuit-switched network). So if Person A wants to connect to Person B, they both get patched onto the long-distance line through the hub. The drawback of this method is that nobody else in the network can use the long-distance line until they finish their conversation. We call this line the bottleneck of the network.

By 1972 the designers of what we now call
the internet were building ARPANET. It began as a small network of computers at universities on both sides of the country. To keep one conversation from tying up a shared line, they came up with a clever way of weaving computer conversations together: breaking up all digital messages into chunks known as packets. Each packet is labeled with its source and destination. The job of the hub is simple: send packets as they arrive from various sources on a first-come, first-served basis. On the receiving end this weaving is undone and the packets are put back into their original sequence, effectively squeezing multiple conversations along the line at the same time.

New cities were connected to hubs, and those hubs were connected to each other, so more and more digital data needed to get squeezed down these long-distance lines. But there is a fundamental limit to how many bits you can squeeze down any line per second. This limit is known as the line capacity (or bandwidth). As you add more and more households to the network, you run into a problem we’ve all experienced in traffic: a waiting line develops. If you are watching a streaming video, you will notice the packets stuck in that line as unwanted pauses in your stream.

To see exactly how this happens, let’s do
a simple example and ignore all the people on the internet except one household. Imagine that upstairs in this household someone is watching a video stream from Washington, and downstairs someone is listening to an audio stream from California. The video packet could take one pathway and at the same time the audio packet could take another. No problem.

But now let’s add another household to the network, in a separate city, and assume they also want the identical audio and video streams. Perhaps they are popular shows. They have a direct pathway to the music source, but they also need the middle path for their video source, the same middle path that carries the first household’s audio. This causes a waiting line, or queue, to develop, so the hub must transmit the two packets down the line one after the other. This is what we mean when we say the network is ‘slowing down’: people begin to wait for their packets because of queues that develop at the hub. So the big engineering problem today becomes: how can we make these queues at our bottlenecks smaller?

In the year 2000 a seminal paper was published
(Ahlswede et al.), and it’s based on a clever strategy for what happens at the nodes in a network. Instead of treating packets like cars and sending them one at a time, the idea was to treat packets as numbers and blend them together using mathematical operations, such as addition. The result is a new packet that contains a mixture of information from more than one packet. Think of these mixed packets as providing clues about what’s in them.

Let’s return to our example to see exactly how it works. Let’s say the video packet being sent can be represented by the number 1, and the audio packet by the number 4. When the video packet hits the first hub, it’s the only one to arrive, so it’s sent unchanged along all the output lines. The same thing happens when the music packet hits its first hub. But now look at the middle hub: it receives two packets, and this time it adds them together and sends the result as a single packet. Notice that both households now receive their packets just as fast as if they were alone on the network!

Finally, the key step is to look at how we can unmix these mixtures on the receiving end. Let’s look at Household 1 first. It receives the video packet directly. To extract the music packet from the mixture, it needs to solve for the missing value by doing simple math we all learn in school. We know the video packet = 1 and the video + music packet = 5; therefore, the music packet equals 4.

And that’s the great insight of this paper. We treat each mixture as an equation, the
receiving computer solves for the variables, or packets, it needs. This is in sharp contrast to how the internet was originally designed, because it allows multiple packets to travel the same line at the same time, instead of each packet having to get in line and wait its turn. And this system works in practice because computers can now do the basic arithmetic operations needed to solve these linear equations incredibly fast, millions of times per second.

By 2020 it is expected that ~75% of network traffic will be video streams and the traffic from mobile phones alone will be 30 exabytes. And we increasingly rely on our communication networks for services that are considered fundamental, such as health, finance, and education. Therefore, solving problems related to network congestion is critical. Network coding is a great approach because it allows us to mix information packets instead of having each packet get in line and wait its turn.

This video presented the basic principle of network coding in a simplified manner, but there still exist many open questions, as well as new applications to explore, such as wireless networks, security, distributed computing, and storage. An active research area in information theory today is how to best leverage this paradigm shift to network coding as we build our digital future.
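
The worked example from the transcript (video packet = 1, audio packet = 4, middle hub adds them) can be sketched in a few lines of Python. This is only an illustration of the arithmetic under the transcript's simplifying assumptions: each packet is a single number, nothing is lost in transit, and each household already knows which packets went into the mixture. The function names `middle_hub` and `decode` are made up for this sketch.

```python
def middle_hub(packets):
    """Mix every packet that arrives at the hub into one packet by adding them."""
    return sum(packets)


def decode(mixture, known_packet):
    """Solve mixture = known_packet + unknown for the unknown packet."""
    return mixture - known_packet


video, audio = 1, 4

# The middle hub receives both packets and forwards a single mixed packet.
mixture = middle_hub([video, audio])

# Household 1 gets the video directly and the mixture via the middle line.
household_1 = {"video": video, "audio": decode(mixture, video)}

# Household 2 gets the audio directly and the mixture via the middle line.
household_2 = {"audio": audio, "video": decode(mixture, audio)}

print("mixture on the middle line:", mixture)
print("household 1:", household_1)
print("household 2:", household_2)
```

The middle line carries one packet instead of two, yet each household ends up with both the video and the audio value.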

42 thoughts on “How internet communication works: Network Coding”

  1. Unfortunately, this feels like a really impractical solution. It assumes there is more than one way to get information to a destination. And how do the routers know the receiver knows part of the mixed information, allowing to undo it? What if one of the packets gets lost?

    I believe a more practical approach is distributed networking, like peer to peer applications. I actually recently came across an open source project called cjdns, that allows to create a completely encrypted, redundant, peer to peer network, using either wired connections or tunneling over existing internet infrastructure. The protocol has a few scaling issues, but I imagine a futuristic internet would look somewhat like that – connect to the two nearest nodes (via some wireless protocol), and be connected to everybody while you forward traffic for other people.

  2. Another issue is that this method seems to rely on content being multicast, which today it is not. AFAIK multicast is not even allowed on the Internet (your ISP won't route multicast traffic into the Internet).

  3. How will the router undo the encoded packets, if the non-encoded packet fails to arrive? Where is this ever used? What type of network?

  4. Lots of issues to contend with here.

    Lost/corrupted/dropped packets seem like a big concern with this concept, since they would simultaneously impact multiple data streams negatively. The working scenario seems beneficial, but you have to think about how bad the situation could get when things start to break down.

    Additionally, there would be practical limits to combining packets. Say your data traverses 10 routers/hubs. It would be possible that packets could be combined multiple times along the way, requiring multiple excisions to get back to your original packet. This could easily come to take longer than waiting for your data to pass through the bottleneck.

    Then there's security; today, if someone wants to obtain my enciphered data, they have to find a way to become a man-in-the-middle to snoop the packets. But with this method, my secure data could become combined with someone else's data, with the resultant packet coming to both myself and that other. After excising his data, he could hold onto my data instead of discarding it, resulting in potential data breach without any need to setup MITM.

    I'm sure there are a lot of other issues that have to be thought through, carefully. Just a few off the top of my head.

  5. What software do you use for the animations?

  6. That's a really clever idea: taking advantage of the fact that you can increase information without increasing information size.

  7. I feel like this video doesn't go into enough detail. There ought to be more than this. I imagine that instead of literally adding the two packets together, which could produce a carry, you XOR them. Secondly, does this mean that you have to rely on the fact that other people using the same connections will happen to download the same content as you do, if not all the time, then at least most of the time? I feel like such an assumption would at least need some data to back it up.

  8. But how will the receiving node know which packets were added together? I mean, the middle node can sum packets that won't be at the receiving end. For example, suppose there is a third household that requests packet "7". The middle node sums it and gets 11 (7+4). But the receiving end has only packet "1", and ends up with packet 10. A completely wrong packet.

  9. Can we get a more technical explanation or in depth overview on how this works? As presented it seems to be missing many key details including:
    How does the key to decode each mixed packet arrive at the host?
    What happens when packets are lost in transit or corrupted?
    Isn't this really just a bandaid for poor design? If the internet of old is starting to show signs of being unscalable, perhaps the real solution is to look at distributed networks?
    What about security, isn't this really dangerous?
    Is this HTTPS compliant? (seems like no?)

    imo the biggest flaw seems to be security. In a world where intelligence agencies dominate the lines companies are increasingly moving towards encrypting everything, but this theory seems fundamentally incompatible with the idea of unique discrete packets. The hub would cause the host to throw everything out because it would see this as tampering with the message.

  10. As a network engineer, I was confused, since this wasn't put out as a new idea, but instead was put forth as how things work today.

  11. your video is so good , i believe your channel will thrive like 3blue1brown , he worked in khan academy as well!
    best wishes
    you are so unique

  12. At the processor level this 'blending' concept is called hyperthreading, which has become a serious concern for security.

    For instance, after the correct data is extracted out at the receiving end, the data that wasn't intended for that source can also be extracted by calculating the difference. The result is data not originally intended for that source will leak privileged information.

    If the data includes the source info, this could be used to create an untraceable 'Man in the Middle' attack by intentionally congesting the network, intercepting packets upstream, and generating malicious responses before the intended source can.

  13. The internet has allowed us to be in communication with every other planet Earth in our universe that has not yet become a black hole.

  14. Hi
    I would like to know how I can implement the intersessional network coding with the COPE protocol
    I use as simulator GNS3 on a network SDN

    thank you
    cordially

  15. I freaking love information theory! This is such an elegant explanation of it and I can’t wait to watch and learn more!
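
Comment 7 above suggests XOR in place of literal addition, so that mixing two packets cannot produce a carry. A minimal Python sketch of that idea follows, assuming two equal-length packets held as `bytes`; the helper name `xor_packets` and the packet contents are illustrative and do not come from the video.

```python
def xor_packets(a: bytes, b: bytes) -> bytes:
    """Mix two equal-length packets byte by byte with XOR."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


video = b"VIDEO-01"
audio = b"AUDIO-04"

# The middle hub forwards one mixed packet, no longer than either input.
mixture = xor_packets(video, audio)

# A receiver that already holds the video packet XORs it out of the mixture.
recovered_audio = xor_packets(mixture, video)

# A receiver that already holds the audio packet recovers the video the same way.
recovered_video = xor_packets(mixture, audio)

print(recovered_audio, recovered_video)
```

Because XOR is its own inverse, the same function both mixes and unmixes, and the mixed packet is exactly as long as each original.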
