Pinpointing Influence in Pinterest
Ok… I am exaggerating. Juju did not make the deadline, Panagiotis and his co-authors with their hard work made the deadline. Ah, ok you caught me lying again… One of Panagiotis co-authors (me) did not work hard, actually he did not work on the paper at all! Normally, I would publicly apologize, but let me first explain why I am in the author’s list.
Panagiotis is a PhD researcher at the University of Athens. He is also a member of the Madgik lab, which is where I know him from. Panagiotis is interested in Graphs, so he is into Apache Giraph and GraphX for Apache Spark. Knowing how frustrating his work may get, I felt obliged to introduce him to Juju. The goal was to save him some time from deploying and configuring infrastructures and have him focus on real work.
Juju offers an often overlooked feature that proves to be immensely useful to researchers and people who just want to experiment with some software without committing to it. You can deploy an infrastructure on your local machine in less than 10 minutes, take it for a test drive, and then when you are happy, move to a cloud and test at scale. In the case of the Madgik lab, where Panagiotis is, getting cloud resources includes contacting the IT department and wait for them to find time and resources. I think I saw a spark in Panagiotis eyes when I showed him my laptop, or was it the reflection of the Spark infrastructure running in LXC containers, I don’t remember. He immediately showed me a Spark deployment of his own with a bunch of worker nodes that took him a week to set up. After all this time, he was still not sure if that configuration was appropriate for running tests and publishing results. What if he had misconfigured something? What if a minor config change (e.g.: cache size) would skew the results? That is not to say Juju has a magic way to optimally tune the infrastructure for your needs, but we try to choose sensible configuration variables based on the feedback we get from the community.
Panagiotis office looked chaotic, computer towers, screens and keyboards all over the place. Funny how he had prepared a VM for us to work on! Setting up Juju was flawless. Deploying a mapreduce processing bundle and scaling it within the limits of 48GB of RAM and 16 cores was a piece of cake. We even double checked that after a reboot all nodes would come up. All this under the eye of George, the head of the lab’s IT. After an hour or so, we said our goodbyes.
A few weeks later, I received an email titled SocInf 2016 submission and my name in the author’s list. Surprised, I asked, “Why?” The response was that, “without Juju we wouldn’t have made the deadline!”
So… you will find us at the 2nd International Workshop on Social Influence Analysis, co-located with the International Joint Conference on Artificial Intelligence (IJCAI 2016) in NY on the 9th of July. The paper is titled Pinpointing Influence in Pinterest. Kudos to Panagiotis Liakos, Katia Papakonstantinopoulou, Michael Sioutis, Alex Delis and the Juju Big Data team.