According to the famous Gartner Hype Chart, we are currently at the peak of AI hype. “Autonomous Vehicles,” “Machine Learning,” “Deep Learning,” “Virtual Assistants,” “Smart Robots” and many other terms in this chart are parts of the so-called “AI revolution”. We at Futurice completely agree with the chart and think that the AI topic is currently overhyped. However, we are also excited about the future opportunities that all these AI technologies will bring when they reach the Plateau of Productivity.
There is no doubt that AI will shape all industries and parts of our lives. However, amid the current media buzz it is often hard to get a clear understanding of what people mean when they say “AI”, “Machine Learning” or “Data Science” and how these terms relate to each other. The problem gets worse when you need this clear understanding not on an individual level, but across a multi-disciplinary team of developers, designers, and advisors working together to develop data-driven services.
The ultimate question becomes this: how can we enhance the level of AI knowledge and understanding for professionals with different backgrounds and expertise?
In our Berlin office we decided to hold our first internal Data Hackathon.
“Hackathon” by definition means that a group of people with particular expertise come together to work on a challenging task in a short timeframe. In our case, the situation was a bit more complicated, since the members of our group had varying levels of AI knowledge. Some of our colleagues had never worked with data-driven services and only had a basic theoretical knowledge of AI. So, the first thing we did during the hackathon was to learn some fundamental Machine Learning concepts that we could then apply straight away to real-world problems.
Speaking of which, nothing motivates the people of Futurice more than finding and solving real problems. Therefore, we didn’t use “good-but-still-theoretical” problems from the internet for our hackathon. Instead, we asked our clients for real problems they currently face.
The reasoning was pretty obvious: we get some real-life data challenges to solve and our clients get to keep the outcomes of the hackathon. As a side-effect, we get to test new ways of co-operation and co-creation, increase the trust between us and our clients and just have fun together.
It’s worth mentioning that the results exceeded all expectations and we managed to achieve that elusive “win-win” situation for all parties:
We used the name “Data Hackathon” and referred to tasks as “data problems”, because they simply did not fit under traditional Machine Learning problem definitions.
A large insurance company asked us to come up with a user interface for one of their internal services. This interface would provide a way of collecting different data points from the end user in order to improve and retrain the classification model integrated into the application's backend. The interface needed to be intuitive and fit well into the overall workflow of the end user.
In less than two days our three designers, two data analysts and a data scientist from the insurance company came up with several strategies for the data collection. The insurer expressed the wish to continue working on the suggested solution after the Hackathon.
This is a perfect example of UI/UX design for Machine Learning. At Futurice we believe that in the near future every application development project will include the design of AI elements from day one, just as application development today is impossible to imagine without UI/UX design.
Oetker Digital, on the other hand, asked us to experiment with personal voice assistants and develop one for their flagship product, the baking portal backen.de. There is no doubt that, with current advances in Natural Language Processing, new types of HCI (Human-Computer Interaction) such as voice and gestures will spread across different domains in the near future. Baking and cooking websites are perfect candidates to pioneer such new ways of interaction.
During the Hackathon a mixed team of participants from Futurice and Oetker Digital developed a demo prototype using the Google Voice Kit and the Dialogflow service. The prototype was able to recognise more than 15 user intents and to provide a screen-free baking experience. The team conducted a successful first user test, which was crucial for iterating on the solution.
As a result, Oetker Digital got a quick-yet-functional prototype with an extensive list of possible problems and pain points to address while developing the voice assistant. We at Futurice have once more realized how difficult it is to design natural interaction flows and to imitate human behaviour. Without a doubt this experience will help us a lot during future projects involving voice interfaces.
After having two interesting but rather “data related” problems, we still wanted to get our hands dirty with actual Machine Learning modelling. For that purpose we decided to tackle two problems:
- Predicting patients’ appointment booking behavior at a medical center.
- Futurice’s internal hour marking system.
Many organizations that deal with a constant flow of clients face problems with appointment bookings. Human error turns simple meeting scheduling into a tedious task with high costs. Even though every domain has its own nuances, the problem is rather general and follows similar patterns across domains.
We explored how data analysis can solve appointment booking problems for a private medical center in Dubai. The overall costs of the clinic consist of many factors:
Once we saw the administrative bookings journal, it almost intuitively felt like a Machine Learning problem and a perfect training opportunity for the Hackathon.
| SL. No | Appt. date | Appt. time | Regn No | Doctor | Status | Remarks |
|---|---|---|---|---|---|---|
| 301 | 6/2/2018 | 11:00 | 106553 | G O | Billed | general checkup |
| 302 | 6/2/2018 | 11:15 | 106305 | M P | Billed | arrived 5 mins early |
| 303 | 6/2/2018 | 11:30 | 106081 | M E | Billed | Arrived 7 mins late, moved from 07/02 |
| 302 | 6/2/2018 | 11:40 | 106229 | J Z | Billed | Arrived 30 mins late |
| 305 | 6/2/2018 | 11:45 | 104966 | M P | Billed | arrived 15 min early, FL, booked with permission |
| 307 | 6/2/2018 | 12:00 | 101836 | M E | Billed | arrived 10 min early, moved from 07/02 |
| 308 | 6/2/2018 | 12:30 | 106145 | M P | Billed | arrived 5 min late, moved from 15/01 |
| 309 | 6/2/2018 | 12:30 | 100825 | M E | Changed | arrived 40 min late and left |
| 310 | 6/2/2018 | 12:40 | 106505 | J Z | Billed | arrived 5 min early |
| 311 | 6/2/2018 | 12:45 | 106168 | M P | Billed | arrived 15 min late |
| 312 | 6/2/2018 | 13:00 | 106569 | M P | Billed | Arrived 25 mins late |
| 313 | 6/2/2018 | 13:00 | 104885 | M E | Billed | arrived 5 min early |
We decided to build a predictive model that would learn each patient’s punctuality behavior over time and use it to suggest, for future visits, a resource allocation for appointments that reduces costs for all parties. Our weapons of choice were Python with the pandas, matplotlib and sklearn libraries.
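In the journal excerpt, punctuality only appears as free text in the Remarks column, so a model first needs it as a number. The following is a minimal sketch of one way to do that, not the code written at the hackathon; the function name `minutes_late`, the regular expression and the sign convention (late is positive, early is negative) are our own assumptions.

```python
import re
from typing import Optional

# Matches remarks such as "arrived 5 mins early" or "Arrived 25 mins late".
# The pattern is an assumption based on the wording seen in the excerpt.
ARRIVAL_PATTERN = re.compile(
    r"arrived\s+(\d+)\s*mins?\s+(early|late)", re.IGNORECASE
)


def minutes_late(remark: str) -> Optional[int]:
    """Return minutes late (negative means early), or None if not stated."""
    match = ARRIVAL_PATTERN.search(remark)
    if match is None:
        return None
    minutes = int(match.group(1))
    return minutes if match.group(2).lower() == "late" else -minutes


# Remarks copied from the journal excerpt.
remarks = [
    "general checkup",
    "arrived 5 mins early",
    "Arrived 7 mins late, moved from 07/02",
    "arrived 15 min early, FL, booked with permission",
    "arrived 40 min late and left",
]
offsets = [minutes_late(remark) for remark in remarks]
```

Remarks that say nothing about arrival time map to `None` rather than 0, so that "unknown" is not confused with "exactly on time".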
Our first aim was to learn as much as possible about the data at hand and understand the kind of problem that we were dealing with. Thus, we kicked off the project with what data scientists like to call EDA (exploratory data analysis). This step provided us with key insights into our data and helped us identify problems such as missing values or errors.
Visualization was an important aspect of this step, so we spent a good chunk of our time building meaningful visualizations. The next step was to prepare the data: clean it and format it so that the ML algorithms could extract patterns from it. The biggest challenge was feature engineering, that is, using the existing features to build new, informative ones. Finally, we reached the exciting step of applying ML algorithms and evaluating our work. We modelled the task as a classification problem, specifically show/no-show of patients based on past appointments, and tried a few ML classification algorithms such as SVM, Decision Trees, AdaBoost and XGBoost, aiming for at least 80% accuracy.
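A first EDA pass of this kind might look roughly like the sketch below. It uses a handful of rows copied from the journal excerpt; the shortened column names are our own, and reading the dates as day/month/year is an assumption (based on remarks such as "moved from 15/01").

```python
import pandas as pd

# A few rows copied from the journal excerpt; column names are shortened
# for convenience and are not the clinic's own.
rows = [
    (301, "6/2/2018", "11:00", 106553, "G O", "Billed", "general checkup"),
    (302, "6/2/2018", "11:15", 106305, "M P", "Billed", "arrived 5 mins early"),
    (303, "6/2/2018", "11:30", 106081, "M E", "Billed",
     "Arrived 7 mins late, moved from 07/02"),
    (309, "6/2/2018", "12:30", 100825, "M E", "Changed",
     "arrived 40 min late and left"),
    (312, "6/2/2018", "13:00", 106569, "M P", "Billed", "Arrived 25 mins late"),
]
columns = ["sl_no", "date", "time", "regn_no", "doctor", "status", "remarks"]
journal = pd.DataFrame(rows, columns=columns)

# Combine date and time into one timestamp.
# Assumption: dates are written day/month/year.
journal["appointment"] = pd.to_datetime(
    journal["date"] + " " + journal["time"], format="%d/%m/%Y %H:%M"
)

# Typical first questions: is anything missing, and how are values distributed?
missing_per_column = journal.isna().sum()
status_counts = journal["status"].value_counts()
appointments_per_doctor = journal.groupby("doctor").size()
```

From here, `status_counts.plot(kind="bar")` or similar matplotlib calls are a quick way to turn the same summaries into charts.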
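To give an idea of the modelling step, here is a minimal sketch of comparing several scikit-learn classifiers with cross-validation. It is not the hackathon code: the data is synthetic, and the three features (`past_no_show_rate`, `mean_minutes_late`, `days_booked_ahead`) are hypothetical examples of engineered features. XGBoost is left out because it lives in a separate package.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(seed=0)
n = 400

# SYNTHETIC stand-in features, one row per upcoming appointment.
# These are hypothetical and not derived from the clinic's data.
past_no_show_rate = rng.uniform(0.0, 1.0, n)
mean_minutes_late = rng.normal(5.0, 15.0, n)
days_booked_ahead = rng.integers(0, 30, n)
X = np.column_stack([past_no_show_rate, mean_minutes_late, days_booked_ahead])

# Synthetic label (1 = no-show): more likely for patients with a poor history.
logit = 4.0 * past_no_show_rate + 0.03 * mean_minutes_late - 2.5
p_no_show = 1.0 / (1.0 + np.exp(-logit))
y = (rng.uniform(0.0, 1.0, n) < p_no_show).astype(int)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
    # SVMs are sensitive to feature scale, so standardize first.
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
```

Cross-validation is used here instead of a single train/test split because, with a small journal, one split can make a model look better or worse than it is. For an imbalanced show/no-show target, accuracy alone can also be misleading, so a metric such as recall on the no-show class is worth checking alongside it.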
We experienced a steep learning curve on this project and these are some of our key learnings:
For our last challenge, we tried to come up with a predictive model that would forecast hour markings for each employee based on her or his personal history. During the exploratory data analysis phase we quickly realized that work in an IT consultancy is often so irregular that no ML algorithm could ever make sense of the hour marking data. Nevertheless, we applied plenty of statistical data analysis and data mining tools using pandas, numpy and other Python libraries and discovered interesting insights. Some were obvious, such as that most of our internal meetings happen… surprise, surprise… on Fridays (when we all work from our office and not from clients’ premises). Others were more useful: for some particular events, the correlation between Google Calendar entries and entries in the hour marking system is rather high. We could therefore use that data to build a classifier, using logistic regression or any other model, to automate part of the hour marking, or simply map people’s work calendars to the hour marking system.
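The calendar idea can be sketched as follows. This is an illustration on synthetic data, not our actual hour marking data: each row stands for one employee-day and one type of event, and the column names `calendar_hours`, `marked_hours` and `was_marked` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(seed=1)
n = 300

# SYNTHETIC data: hours in the calendar for one event type on a given day.
has_event = rng.random(n) < 0.4
calendar_hours = np.where(has_event, rng.uniform(0.5, 3.0, n), 0.0)

# Assumption: people usually mark hours when the calendar has the event,
# and occasionally mark hours without a calendar entry.
was_marked = np.where(has_event, rng.random(n) < 0.9, rng.random(n) < 0.1)
marked_hours = np.where(was_marked, np.maximum(calendar_hours, 0.5), 0.0)

days = pd.DataFrame(
    {
        "calendar_hours": calendar_hours,
        "marked_hours": marked_hours,
        "was_marked": was_marked.astype(int),
    }
)

# Pearson correlation between calendar entries and hour markings.
correlation = days["calendar_hours"].corr(days["marked_hours"])

# Logistic regression: predict from the calendar whether hours get marked.
clf = LogisticRegression()
clf.fit(days[["calendar_hours"]], days["was_marked"])
train_accuracy = clf.score(days[["calendar_hours"]], days["was_marked"])
```

A high correlation for a given event type suggests that the calendar alone could pre-fill the corresponding hour markings. The accuracy here is measured on the training data, so it only shows that the pipeline runs, not how well it would generalize.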
Main takeaways from this track were:
In the end, rather than repeating how awesome the event was and how much fun, learning and working experience we gained during this weekend, I’d like to quote feedback from our clients, who believed in our initiative and supported us by providing problems and co-creating with us during the hackathon. We are really grateful for that <3
The feedback is posted without edits, as complete honesty and eagerness to improve are at the core of Futurice culture.
Data Scientist at a large insurance company:
Overall I liked the format, structure and spirit of the Hackathon a lot. What I liked especially:
- Clear agenda communicated upfront. Someone at the event who is taking care that timelines are met.
- Quite diverse projects. Participants could choose, based on their interests, what they worked on
- Good mix between breaks and working phases
What could be improved from my perspective:
- The internal presentations were good, they were both very interesting and well prepared, but I think some contribution from an external expert would also be beneficial.