Opening Talk


Jose A. Ortega Ruiz – BigML


Jao co-founded BigML in 2011. He has been involved since the 90s in Free Software projects and programming languages (GNU MDK, Geiser, xmobar…). He holds a Ph.D. in theoretical physics (Barcelona) and a BS in computer science (Madrid). He turned to computers in the 90s, working for telecoms and aerospace (Indra), AI and web security startups (Isoco, Scytl), our internet overlords (Google) and next-generation user interfaces (Oblong), with an interlude in academia teaching programming and networks. He still loves Scheme and books, and he is learning to play the piano.
 
 
Predictions are hard (and non-relational)
The opening talk is about our experience at BigML, a startup creating a machine learning platform geared towards distributed processing of big data in the cloud, offered as a SaaS platform for the non-initiated. We’ll discuss our technology choices and how they’ve been influenced not only by technical constraints, but also by the need to quickly evolve and transform the system as we grow to understand our problems and their solutions better. We’ll see how adopting NoSQL from the very beginning has helped us and how, in some ways, it has hindered our efforts, in the context of deciding what to take and what to leave out from the “standard”, buzzword-laden “big data” software ecosystem.
 
view the slides
 

 
 
 

Use Cases

David Arcos – Catchoom


David Arcos is a Senior Backend Engineer at Catchoom. He’s a Python developer interested in scalability, security and distributed systems.
 
NoSQL matters in Catchoom’s image recognition Platform – Use case of NoSQL on the Catchoom Platform
This presentation will explain the architecture of the Catchoom Platform, emphasizing the advantages of using NoSQL for the specific requirements of visual recognition. However, David will also admit the limitations of NoSQL, explaining where Catchoom is still using old-fashioned SQL.
 
view the slides
 

 
 
 

Pablo Enfedaque – Telefónica R&D


Pablo is a SW engineer at Telefónica Product Development and Innovation (formerly Telefónica R&D). For the last two and a half years he has worked as Technical Leader of the Personalisation Server.
 
From Oracle to MongoDB, a real use case at Telefónica R&D
The talk will cover the use case of the Personalisation Server (PS), a master customer profile store for the companies of the Telefonica Group (Telefonica, O2…). It provides real-time (REST API) and batch interfaces to update, retrieve and share customer profiles. Initially the PS used Oracle, but due to scalability and cost issues we implemented a new version with MongoDB.
In the talk we will see the problems that made us move to MongoDB and all the benefits that we obtained (with real performance figures, of course).
Right now the Oracle version is being used in the UK and Ireland (approx. 30M user profiles stored) and the NoSQL version is being deployed in Mexico (18M customers) and other Latam countries.
 
view the slides
 

 
 
 

Marc Pous – BDigital

Marc Pous is a member of the mobility research group at Barcelona Digital Technology Center (BDigital), and he is currently leading the technical activities in several research projects related to smart cities and urban computing.
 
The real-time Barcelona urban mobility with NoSQL technologies
In this talk Marc proposes two complementary ways to exploit people's sensing capabilities in order to understand their mobility patterns in an urban environment: extracting information from social networks, and acquiring more accurate position information, recommendations and network optimizations through gamification techniques. Moreover, we describe our experience and the barriers we encountered when deploying our solution using NoSQL technologies (based on documents and graphs), with the goal of understanding urban processes such as the identification of transitory areas of activity and mobility patterns in the public transportation network (PTN).
 
view the slides
 

 
 
 

Marc Sturlese & Dani Solà – TROVIT SEARCH SL


Dani Solà is a Backend Engineer at Trovit, where he works in the Hadoop-Lucene team. He’s interested in distributed systems, data mining and search.
 

Marc Sturlese is a Backend Engineer at Trovit. He has been working with Hadoop, Lucene and other technologies to develop a solution that allows Trovit to grow without having to worry about scalability.
 
Trovit – From legacy, to batch, to near real-time
In this talk we will present our transition from an old legacy system built on top of MySQL and a bunch of PHP scripts to a system based on Hadoop and Hive, which allows us to process and keep stats of hundreds of thousands of ads several times a day and index them to be served to the end users. We will also present our current undertaking to deliver the ads from our sources (property portals, job boards…) to our visitors in near real-time using Storm, HBase and Zookeeper.
 
view the slides
 

 
 

Products

Chris Anderson – Couchbase


Chris Anderson is a co-founder of Couchbase and an Apache CouchDB committer. At Couchbase he leads the mobile strategy, and designs and implements developer-facing APIs. Outside of computing, Chris plays bass and dances “Ring Around the Rosie” with his 1-year-old daughter.
 
Grid Computing with Couchbase
A common problem in large-scale computing is coordinating workers when they can be scattered across thousands of compute nodes. For workloads like this, atomic operators like increment and decrement reduce contention between distributed processes. In this talk I’ll show a full text analysis tool which ranks words in the Twitter firehose. By storing each token in a key based on its characteristics, we can provide word rankings both globally and over time and space. For instance, a tweet in English from San Francisco might say “Go Giants”, so counters for 2012:go and usa-sf:2012-07:giants (among a few dozen others) are incremented. Even using memory like this, the counts from a full corpus of English text would only take a few gigabytes to hold.
The Twitter stream analysis can be run on any number of worker nodes, all connected via Couchbase Server. Attendees will have the opportunity to help contribute to the data gathering, and will be able to see data mining results in realtime.
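The keyed-counter idea in the abstract can be sketched roughly as follows. This is only an illustration, not the speaker's tool: an in-memory `Counter` stands in for the server-side atomic increment, the function names `counter_keys` and `count_tweet` are hypothetical, and only the two key shapes quoted in the abstract (`2012:go` and `usa-sf:2012-07:giants`) are modelled.

```python
from collections import Counter


def counter_keys(token, year, month, region):
    """Build the counter keys for one token (hypothetical helper).

    Only the two key shapes quoted in the abstract are produced:
    a yearly key and a region-plus-month key.
    """
    token = token.lower()
    return [
        f"{year}:{token}",
        f"{region}:{year}-{month:02d}:{token}",
    ]


def count_tweet(counts, text, year, month, region):
    """Increment every counter touched by the words of one tweet."""
    for token in text.split():
        for key in counter_keys(token, year, month, region):
            # Assumption: in a real deployment this line would be an
            # atomic increment on the shared server, so that many
            # workers can update the same key without coordination.
            counts[key] += 1


counts = Counter()
count_tweet(counts, "Go Giants", 2012, 7, "usa-sf")
count_tweet(counts, "go team", 2012, 8, "usa-sf")
print(counts["2012:go"], counts["usa-sf:2012-07:giants"])
```

Because each update is a plain increment on an independent key, workers never need to read, modify and write back a shared structure, which is where the reduced contention comes from.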
 
view the slides
 

 
 
 

Michaël Figuière – DataStax


Michaël is an engineer and a developer advocate at DataStax where he actively works to improve Cassandra. At ease with both Enterprise Java and lower level technologies, he specializes in distributed architectures and topics such as NoSQL, search engines, and data processing. He often speaks about NoSQL in French Java User Groups and loves to write about his favorite topics.
 
Real-time Big Data in Practice with Cassandra
Big Data is a fast-growing trend in enterprise applications that comes with a novel promise compared to past technological revolutions: being able to retrieve and manipulate as much data as necessary to bring up new use cases or to improve the user experience. This is made possible thanks to a set of new technologies and databases that are designed to handle any scale.
This presentation will use Cassandra to implement several use cases that follow the big data way of thinking together with a real-time approach: business and web analytics, storing a click stream or a news feed, and more…
 
view the slides
 

 
 

Christian Kvalheim – 10gen

Christian is the original creator of the Node.js Driver for MongoDB. He currently maintains the Node.js Driver as a Developer Evangelist for MongoDB at 10gen.
 
MongoDB 2.2 and Big Data
MongoDB’s 2.2 release includes a wide array of feature improvements for big data scenarios. In this talk, we’ll walk through three new features that improve MongoDB’s performance with Big Data: the Aggregation Framework, Time to Live Collections, and Tag-aware sharding. We’ll walk through real-life scenarios for where and how to use these features for maximum performance.
 
view the slides
 

 
 

Martin Schönert – triAGENS

The Cologne developer Martin Schönert first heard about NoSQL at the age of… well, we can’t tell exactly. But it seems that he soaked up the whole thing with his mother’s milk. Being in business for over 30 years, Martin has worked as project director, developer, product manager and technical manager, in enterprises from Cologne to the US, from startups to IBM to Deutsche Post AG.
 
ArangoDB – The universal database
This talk will give an introduction to ArangoDB. ArangoDB is a new open source NoSQL database whose goal is to be a powerful universal database. It allows flexible data modelling: as key-value pairs, documents or graph data. It allows complex queries through an elegant query language. An embedded JavaScript interpreter allows you to add functionality to the database for transactions and complex manipulations of the objects, treating ArangoDB as a complete application server – database stack. ArangoDB guarantees durability through MVCC with append-only journals and zero-administration availability through synchronous replication. It utilizes main memory and SSDs to deliver maximal performance. Even though ArangoDB is still a young project, it has already attracted international attention, especially in Japan and the USA. The community is currently working on APIs for Python, D, Ruby, Java and node.js.
 
view the slides
 

 
 
 

Technical Aspects

Pavlo Baron – codecentric AG


Pavlo Baron is lead architect with codecentric AG. His passions are distributed systems and large data sets – the infrastructure behind what they call Big Data. Pavlo is a frequent conference speaker and has written three German books: “Erlang/OTP”, “Pragmatic IT Architecture” and “Fragile Agile”. He is already working on his next book, “Big Data for decision makers”.
 
Dynamo concepts in depth
The Amazon Dynamo paper described a subset of ideas and concepts, oriented to a special use case and actually working in practice, which are as old as the Bavarian forest. I will explain where they came from, reference the origins, and also explain these concepts in technical depth, as well as why this extract from distributed systems theory was necessary for its special use case.
 
view the slides
 

 
 
 

Luca Garulli – Nuvolabase Ltd.


Living between Rome and London, Luca became the CEO of NuvolaBase Ltd, the company behind the OrientDB – NoSQL Open Source project. Despite his young age, the 36-year-old author of the Roma Meta Framework project quickly became renowned within the European NoSQL scene and also became a member of the Sun/Oracle Expert Group for JSR 12 and 243.
 
Switching from the relational to the graph model
In the last 30 years the Relational DBMS has dominated the way we design and manage the persistence of information. A few years ago the NoSQL movement brought alternative approaches to storing data. This session will analyze the Graph model and how it can be used successfully by people with relational skills.
 
view the slides
 

 
 
 

Matt Heitzenroder – Basho

Matt joined Basho Technologies in 2010 and is currently responsible for opening and operating Basho’s first international office in 2012, based in London. Previously he had worked for five and a half years with SugarCRM in client service and engineering roles. For more than a decade, Matt has participated in the launch of a number of venture-backed start-ups. Matt is a developer at heart and you can often find him hacking away on various Erlang projects in his free time. And if he’s not doing that, he’s probably somewhere at sea either sailing or scuba diving. Everybody just calls him “Roder”.
 
Eventual Consistency in the Real
Systems requiring very high availability may sometimes make the compromise known as “eventual consistency”. This trade-off introduces the intimidating prospect of application-level conflict resolution, which requires a shift in thinking but is not as problematic as it may seem. Most examples of conflict resolution in such systems are trivial; the canonical example being the set union of shopping carts. This talk considers that many businesses already know how to handle eventual consistency, and provides examples of conflict resolution taken from the real world.
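The shopping cart example mentioned in the abstract can be sketched as follows. This is a minimal illustration, not code from the talk: the function name `merge_carts` and the sample carts are hypothetical, and carts are modelled as plain sets of item names.

```python
def merge_carts(siblings):
    """Resolve conflicting cart versions by taking their set union.

    `siblings` is an iterable of sets, one per divergent replica
    (a hypothetical representation chosen for this sketch).
    """
    merged = set()
    for cart in siblings:
        merged |= cart
    return merged


# Two replicas accepted different writes while they could not talk
# to each other; the application merges them when it next reads.
replica_a = {"book", "lamp"}
replica_b = {"book", "kettle"}
resolved = merge_carts([replica_a, replica_b])
print(sorted(resolved))
```

Union never loses an added item, which is why it is the usual first example. The trade-off is that an item removed on one replica can reappear after the merge, so a real system may need to track removals explicitly.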
 
view the slides
 

 
 
 

Alexandre Morgaut – 4D

Alexandre is the creator of NantesJS and a Web Developer, Web Architect, and Community Manager at 4D. He gives presentations about JavaScript, server-side JavaScript, NoSQL, and Wakanda. Alexandre participates in Web Standards mailing lists, makes technical recommendations on anything related to HTML, JavaScript or HTTP, and works to promote JavaScript as a professional language.
 
NoSQL & JavaScript: A love story!
You may all know that JSON is a subset of JavaScript, but… Did you know that HTML5 implements NoSQL databases? Did you know that JavaScript was recommended for REST by HTTP co-creator Roy T. Fielding himself? Did you know that map & reduce are part of the native JavaScript API? Did you know that most NoSQL solutions integrate a JavaScript engine? CouchDB, MongoDB, WakandaDB, ArangoDB, OrientDB, Riak…. And when they don’t, they have a shell client which does. The story of NoSQL and JavaScript goes beyond your expectations and opens more opportunities than you might imagine… What better match could you find than a flexible and dynamic language for schemaless databases? Isn’t an event-driven language what you’ve been waiting for to manage consistency? When NoSQL doesn’t come to JavaScript, JavaScript comes to NoSQL. And does it very well.
 
view the slides
 
