“Computing in the Very Large” Seminar

I recently attended a seminar by Ariel Rabkins, who received his PhD in Computer Science from UC Berkeley. His talk focused on MapReduce, a programming model created by Google and used by big companies like Google and Facebook to store and process large quantities of data. These companies are known to store and process petabytes of data on a daily basis, and the question arises of how you manage such large quantities of data without system malfunction. This is where MapReduce comes into play.

There are two parts to MapReduce:

1. Map: A master node takes the large input and divides it into smaller sub-problems, which it hands out to worker nodes. Each worker node processes its piece of the data and returns the result to the master node.

2. Reduce: The master node collects the results from the worker nodes and merges them into a single answer to the original problem.
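
To make the two steps concrete, here is a minimal single-machine sketch in Java that counts words. It is not the code from the seminar, and it does not use Google's or Hadoop's real API. The class name WordCountSketch and the methods map, reduce, and run are names I made up for illustration, and the "worker nodes" are just ordinary method calls on one computer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the map and reduce steps on one machine.
// A real MapReduce system would run the map calls on many worker nodes.
class WordCountSketch {

    // Map step: one "worker" turns its chunk of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String chunk) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : chunk.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: merge all pairs by adding up the counts for each word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by word
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Plays the role of the master node: hands each chunk to map,
    // gathers the results, then reduces them into one answer.
    static Map<String, Integer> run(List<String> chunks) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String chunk : chunks) {
            all.addAll(map(chunk));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // prints {big=2, data=2, everywhere=1, is=2}
        System.out.println(run(List.of("big data is big", "data is everywhere")));
    }
}
```

Each string passed to run stands in for one of the small sub-problems, so the two sentences above are treated as work for two separate workers before their counts are merged.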

I really enjoyed this seminar because he showed us how MapReduce is programmed in Java, and I am currently taking CSE 017, which covers Programming and Data Structures. I was able to make the connection that large amounts of data can be complicated to handle, but with the right programming tools the job becomes manageable.


About lunur16

I am an undergraduate student at Lehigh University. I am from Allentown, Pennsylvania. My intended major is Computer Engineering, possibly with a minor in Business.