Data science is a huge playground. We have all heard about Big Data, but most people don’t know that it is defined by three characteristics, the 3 Vs:

1. Volume: there is so much data that a single person or computer cannot analyze it all;

2. Variety: it comes from many different sources and in many formats, which makes it largely unstructured;

3. Velocity: it is generated so quickly that the amount keeps growing faster than it can be processed.

As a business, we strive for two more Vs on behalf of our clients:

1. Veracity: understanding the quality of the data and how well it can support a client’s strategy;

2. Value: we process and analyze all this data to drive the commercial success of our clients.

In this article, I’ll share how focusing on Veracity and Value through the lens of deploying a self-developed smart home improved my overall skill set as a data scientist.

A couple of years ago, my wife and I became parents and were looking for a new family home in the countryside where we could settle down and raise our children. We fell in love with an old farmhouse that was lovely, affordable and in a condition we believed we could refurbish ourselves.

I have always been a tech guy, so I wanted this old house to become modern and data-driven, too. Unlike modern smart homes, this home had to be retrofitted with smart home elements. To preserve the charm of the old house, I had to apply these elements discreetly and carefully. Eight years later I am still learning, but I have discovered a few key pointers.

Solutions need to be functional! No one living in a smart home will accept anything less. The ability to set the atmosphere in a room by controlling the indoor lighting to complement the ambient light outdoors is a great effect, but the effect is dampened significantly if the light doesn’t switch on immediately while your system attempts to match the outdoor light. A blink of an eye (~300 milliseconds) is about as long as users will wait before they become impatient and start pushing the button over and over again. Thus, you need to take your outdoor ambient light measurements in advance, even if you sacrifice a little accuracy.
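As an illustration, here is a minimal Python sketch of that “measure in advance” idea: a background job keeps a cached ambient light reading fresh, and the button handler reads the cache so it can respond within the ~300 millisecond budget. The sensor and light objects, their method names and the brightness mapping are hypothetical placeholders for whatever your hardware API provides, not the code running in my house.

```python
import time

class AmbientLightCache:
    """Keeps a recent outdoor light reading so a button press can be
    answered instantly instead of waiting for a fresh measurement."""

    def __init__(self, sensor, max_age_s=60.0):
        self.sensor = sensor          # hypothetical object with a read_lux() method
        self.max_age_s = max_age_s    # accept slightly stale readings for speed
        self._lux = None
        self._ts = 0.0

    def refresh(self):
        # Called from a background loop, e.g. every 30 seconds.
        self._lux = self.sensor.read_lux()
        self._ts = time.monotonic()

    def get(self):
        # Fast path: return the cached value immediately. Only block on the
        # sensor if the cache is empty or far too old.
        if self._lux is None or time.monotonic() - self._ts > self.max_age_s:
            self.refresh()
        return self._lux


def brightness_for(lux, max_lux=500.0):
    # Illustrative mapping: the darker it is outside, the brighter the lamp.
    return max(0.0, min(1.0, 1.0 - lux / max_lux))


def on_button_press(cache, light):
    # Stay inside the ~300 ms budget by using the cached ambient reading
    # instead of triggering a fresh (slow) measurement on demand.
    light.set_brightness(brightness_for(cache.get()))
```

The trade-off is exactly the one described above: the reading may be up to a minute old, so you sacrifice a little accuracy in exchange for an instant response.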

This applies to business practices, too. Your offering should be functional and integrated across all channels in order to make an impact and deliver a seamless experience for your customers. You should measure all the relevant information that complements your users’ experience.

Added value also comes from understanding Volume and Velocity. One day my temperature sensor for the living room heating died. As smart home technology evolves quickly, the original sensor was already out of production and I had to find a new, compatible one. The new sensor reported the temperature at one-second intervals with a resolution of 1/10th of a degree. Since I had already learned that heating control is a slow business with high latency, I wondered why I should measure the temperature every second and whether I could even notice a difference of 5/10ths of a degree in room temperature. The new sensor produced a lot of data points: 3,600 per hour compared to just 4 per hour with the old one.

At first, I noticed that the graphs of the plotted room temperature looked smoother and nicer at any time scale, but I still felt this was a massive overhead for little to no benefit. While playing around with the graphs, I suddenly realized I could see little spikes whenever the TV was on. This caught my attention, and soon I realized that the change in resolution from 1 degree to 1/10th of a degree, combined with the much higher measurement frequency, enabled me to measure not only the room’s temperature but also the waste heat of a person.
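The effect is easy to reproduce with a few lines of pandas. The sketch below is illustrative rather than my actual analysis: it smooths the one-second readings and flags intervals where the room warms noticeably faster than usual, which is exactly where a TV or a person leaves a trace. The window size and threshold are assumptions you would tune for your own room.

```python
import pandas as pd

def temperature_spikes(temps: pd.Series, samples_per_window: int = 600,
                       threshold: float = 0.05) -> pd.Series:
    """Flag moments where the room temperature rises unusually fast.

    temps              : temperatures in degrees Celsius, roughly one reading
                         per second at 0.1 degree resolution, indexed by timestamp.
    samples_per_window : 600 samples = a 10-minute window at 1 Hz.
    threshold          : minimum rise (in degrees) over one window to count as a spike.
    """
    smoothed = temps.rolling(samples_per_window, min_periods=1).mean()  # damp sensor noise
    rise = smoothed.diff(samples_per_window)   # degrees gained over the last window
    return rise[rise > threshold]
```

With the old sensor’s four whole-degree readings per hour, the same calculation returns nothing: a bump of a few tenths of a degree is simply invisible.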

You might know from parties, birthdays or family gatherings at home that once there are many people in the same room, it quickly becomes uncomfortably warm. This is because a resting human gives off roughly 100 watts of heat, so having ten people in the room is like running a 1-kilowatt heater non-stop. For reference, an open fireplace can put out up to about 12 kilowatts.

It didn’t take long to write an algorithm that estimates the number of room occupants from the temperature measured at the higher frequency and finer resolution. It certainly does not work in real time, since even small changes in temperature can only be measured with a delay, and it does not cover cases where you rush in and out of the room. However, it is a nice use case that shows how seemingly unrelated data events can be used to draw an indirect but correct inference about something completely different from what was measured. I was able to add value to a previously less valuable measurement. Today I can estimate the number of people in the living room with 95% accuracy within 15 minutes of them entering the room.
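The real algorithm is calibrated to my living room, but the core idea fits in a few lines: compare the room’s observed heating rate against its baseline rate, convert the difference into watts using the room’s (separately calibrated) thermal mass, and divide by the heat a resting person gives off. Everything below, including the parameter names and the 100-watt-per-person figure, is a simplified, hypothetical sketch rather than the code I actually run.

```python
def estimate_occupants(temps, baseline_rate, heat_capacity_j_per_deg,
                       watts_per_person=100.0, window_s=900):
    """Very rough occupancy estimate from a room's heating rate.

    temps                   : list of (unix_timestamp, temperature_celsius)
                              tuples, roughly one reading per second.
    baseline_rate           : degrees/s the empty room gains or loses under the
                              current heating and weather conditions
                              (calibrated separately).
    heat_capacity_j_per_deg : effective thermal mass of the room in J/degree,
                              e.g. calibrated with a heater of known power.
    watts_per_person        : assumed heat output of one resting person.
    window_s                : look at the last 15 minutes by default.
    """
    latest = temps[-1][0]
    recent = [(t, temp) for t, temp in temps if t >= latest - window_s]
    if len(recent) < 2:
        return 0
    (t0, temp0), (t1, temp1) = recent[0], recent[-1]
    observed_rate = (temp1 - temp0) / (t1 - t0)         # degrees per second
    extra_rate = observed_rate - baseline_rate          # warming caused by occupants
    extra_watts = extra_rate * heat_capacity_j_per_deg  # P = C * dT/dt
    return max(0, round(extra_watts / watts_per_person))
```

The 15-minute window is the reason the estimate is slow: the room’s thermal mass smears a person’s heat over many minutes, so a shorter window would mostly measure sensor noise.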

For my work at Kinesso, I use these kinds of insights almost daily. The technology and products we develop have to be smooth and easy to use, meet the expectations of our users, and integrate seamlessly with existing tech stacks, workflows and all channels involved. When exploring data for our clients, we try to find the right data points at the best possible resolution and frequency to add value and support our clients’ strategies to become commercially more successful. Bluetooth beacon data can be very useful for understanding who is passing by a digital out-of-home (DOOH) location, but only in combination with other data, such as a laser scanner counting people crossing the street plus eye-tracking software, to reveal who really watched a certain DOOH advertisement and who may not have.

In my spare time I continue to perfect my smart home, which delivers 1.6 million data points per day from almost 500 sensors and actuators. Meanwhile, my family only notices that we live in a smart home when we visit friends or family living in a traditional house that lacks these smart features.