Video Analytics

The promise of AI for Retail

Inertia is difficult to overcome, yet today’s retailers are being challenged to change and accommodate the evolving expectations of shoppers. Digital channels offer shoppers the convenience of online shopping, a personalized experience and a virtually unlimited choice of merchandise. Enhancing the shopper experience is therefore an existential need for retailers.

Ecommerce accounts for only about 15% of retail sales today (2020), although its share has been increasing steadily over the years. It is therefore still possible for retailers to reverse the trend and recover some lost ground. The rapidly advancing landscape of AI and machine learning has the potential to help retailers improve the shopper experience.

Finding a way to beat digital convenience

It is obvious that purchasing an item online is far more convenient than driving to a retail store and making a purchase. Retailers need to understand that they do not need to beat e-commerce. They need to find ways to make the shopper experience enriching enough that shoppers are willing to drive to their store. AI and machine learning driven enhancements to the shopper experience give retailers an opportunity to bridge the gap between digital and physical stores.

Smart Mirrors: Smart mirror technologies enable retailers to collect data on the products being tried on and on requests for different sizes or colors. They can also help retailers identify product pairings for promotions and bundling. The shopper’s details can be used to understand shopper preferences, and the shopper’s browsing details can be merged with purchase data to get a better sense of the shopper.

Personalization: Retail spaces could use the biometric recognition capabilities of AI to retrieve shopper profiles and offer customized rewards and promotions, creating a personalized experience for the shopper. At Xtage Labs, we have created a basic proof of concept showing how AI could be used to identify the shopper and combine CRM, purchase database and loyalty card details to create custom promotions for each shopper.

An important consideration, though, is whether retailers can secure sensitive biometric data. Privacy concerns may prove counterproductive and discourage shoppers.

Extended Reality (XR): Extended reality (AR, VR and MR) promises to give retailers the capability to provide a unique shopping experience that could change the face of retail. From browsing variations of a product, such as color or size, to virtually “trying them on” or comparing two items side by side, XR promises to substantially improve the shopping experience.

Real-time Stock Monitoring: Retailers could use camera feeds to monitor the sales velocity of their merchandise. Images of different sections of the stocking units can be processed to generate counts in real time. This will help retailers understand product demand and anticipate shopper needs better.
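As a rough illustration of how shelf counts could be turned into a sales velocity figure, the sketch below assumes that a vision model already produces a units-on-shelf count for each camera snapshot. The function name, the data layout and the sample readings are hypothetical and are not taken from any deployed system.

```python
from datetime import datetime


def sales_velocity(counts):
    """Return units removed from the shelf per hour between the first and last reading.

    `counts` is a list of (timestamp, units_on_shelf) pairs, oldest first.
    A count that goes up is treated as a restock and ignored, not as negative sales.
    """
    if len(counts) < 2:
        return 0.0
    removed = 0
    for (_, prev), (_, curr) in zip(counts, counts[1:]):
        if curr < prev:
            removed += prev - curr
    hours = (counts[-1][0] - counts[0][0]).total_seconds() / 3600
    return removed / hours if hours > 0 else 0.0


# Hypothetical readings for one shelf section
readings = [
    (datetime(2020, 1, 6, 10, 0), 24),
    (datetime(2020, 1, 6, 11, 0), 20),
    (datetime(2020, 1, 6, 12, 0), 30),  # shelf restocked
    (datetime(2020, 1, 6, 14, 0), 22),
]
print(sales_velocity(readings))
```

Ignoring upward jumps is a simplification: units sold and restocked between two snapshots would be missed, so more frequent snapshots give a more reliable figure.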

Store layout as a Recommendation Engine: It is widely reported that about 1 in 3 purchases on Amazon is driven by recommendation systems built on deep learning algorithms. Retailers could use a different machine learning algorithm to identify products that are bought together and optimize the store layout by placing those items next to each other.
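One simple way to find products that are bought together is to count how often each pair of items appears in the same basket. The sketch below is a minimal illustration under that assumption; the product names and baskets are made up, and a production system would more likely use a dedicated association rule or recommendation method.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives every pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical transaction data
baskets = [
    ["jeans", "belt", "t-shirt"],
    ["jeans", "belt"],
    ["t-shirt", "socks"],
    ["jeans", "t-shirt", "belt"],
]
counts = co_purchase_counts(baskets)
print(counts.most_common(1))
```

Pairs with the highest counts are candidates for adjacent placement on the shop floor.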

The future of retail is going to be more personalized. Retailers also need to understand that ecommerce is here to stay, and that they must find ways to enhance the shopper experience. Retail stores should offer experiences, enabled by AI and machine learning, that excite customers and make them want to drive to the store.

Profiling of shoppers using video feeds in an apparel retail store

The client, a retail store chain, is facing challenges from e-commerce and other competitors. They want to take on the challenge and realise the importance of customer and shopper data. They know that e-commerce players have access to browsing and other data that helps them engage with shoppers better. They want to set up a system that captures similar (if not the same) data in an offline retail setting, enabling better decisions.

Dual use of video feeds to generate shopper insights

The client already had high-end CCTV cameras installed in their outlet for security purposes. They wanted to use the same infrastructure to generate shopper insights.

The majority of shoppers are in the 30-40 age range

27% of the visitors are accompanied by companions or family

82% of shoppers interact with some product but do not make a purchase

The study revealed valuable insights on shopper behaviour inside the store and some additional insights on the shopper profiles that were previously unknown.

High cost of training. Data security risks

The retailer knows the importance of shopper data and the insights it can provide. They also know that, to counter the personalization offered by e-commerce platforms, they need to acquire additional data to build a similar solution. Given the offline selling model, they also have an advantage that e-commerce platforms lack: the opportunity to interact with shoppers and understand their motivations, preferences and other insights.

The conceptualised solution had its share of challenges, though. The most pressing was data security, since privacy guidelines from the regulatory agencies prohibit storing shopper biometrics. The second was the shoppers themselves, who have a negative perception of CCTV based intelligence. The last was the risk of the solution being misused following a security breach.

Useful, but resource intensive

The piloted solution provided some valuable insights that are otherwise not available to the retailer.

For the first time, they knew the demographics of their shoppers from their own data sources. The other benefit was knowing which products shoppers interact with most often.

However, there are some valid concerns about commercial deployment:

a. The high cost (data infrastructure) of processing high quality video feeds as a data stream

b. Genuine concerns around regulatory compliance, data security protocols and biometric data usage

The client is currently assessing the risks vs. benefits and the legal implications of commercial deployment.

A pilot study to count the number of vehicles passing by an OOH billboard

A digital Out of Home (DOOH) agency commissioned a pilot study to assess the benefits of computer vision based vehicle counts against the sampling based measurement techniques that they currently use. They wanted to examine the feasibility of the solution in terms of the long term cost of equipment maintenance, light and weather related challenges, and the cost of processing data and maintaining the data infrastructure.

Tracking and counting using Computer Vision

The DOOH client commissioned the pilot to assess the benefits of using Deep CNN based vehicle counts to measure ad views, compared with the sampling based approach that they use.

The solution accuracy was impacted by light and weather conditions

Sampling based estimations had an acceptable 2.7% error

The operational and data processing costs outweigh the benefits

The study established that the deep learning based solution did not offer any significant benefit over the current practice of estimating ad views.

Challenges in training and variety of vehicles.

The digital out of home (DOOH) agency had read about Deep CNN based methodologies for counting vehicles, and hence the views of their ads. They wanted to assess the feasibility of deploying such a solution on their OOH properties. They commissioned a pilot for 3 properties at strategically chosen locations to analyse the effectiveness of the solution and its long term feasibility.

One of the major challenges was collecting enough data to train the Deep CNN algorithm. Vehicles come in many different sizes, which makes consistent detection difficult and affects the accuracy of the vehicle counts. Another challenge was the overhead cost of the physical infrastructure to be maintained at every OOH property. The bandwidth and data processing requirements were a further challenge.

A solution not ready for commercial applications

Deep CNN based vehicle counting is a solution that looks good on paper and demonstrates what artificial intelligence and machine learning can do. However, the cons far exceed the pros for commercial applications, at least as of today.

a. The operational cost of acquiring and maintaining the equipment is prohibitive

b. The huge bandwidth needed to process high quality, streaming video data and generate counts

c. The accuracy of the algorithm, which is affected by lighting, weather and other adverse conditions

All of the above come with no significant benefit over the sampling based estimation methods that they currently use.
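To give a sense of scale for the bandwidth concern, the sketch below estimates the sustained bandwidth and daily data volume of continuous video streams. The per-camera bitrate and the single camera per property are illustrative assumptions, not measurements from the pilot; only the figure of 3 properties comes from the study.

```python
def stream_load(cameras, mbps_per_camera, hours_per_day=24):
    """Return (aggregate bandwidth in Mbps, daily data volume in GB) for continuous streams."""
    total_mbps = cameras * mbps_per_camera
    # Mbps -> megabits per day -> megabytes -> gigabytes (decimal units)
    gb_per_day = total_mbps * 3600 * hours_per_day / 8 / 1000
    return total_mbps, gb_per_day


# Assumed inputs: 3 properties, 1 camera each, 8 Mbps per stream (hypothetical bitrate)
mbps, gb = stream_load(cameras=3, mbps_per_camera=8)
print(f"{mbps} Mbps sustained, about {gb:.0f} GB per day")
```

Processing frames on site and transmitting only the counts would reduce the bandwidth requirement, but it moves the cost to hardware at each property.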

Detecting potential shoplifting incidents in advance using AI

The retail brand has been facing the issue of shoplifting at its retail store. They have an active CCTV based surveillance infrastructure in place throughout the shop floors. However, the process is manual: the security team reviews the footage to identify shoplifters only after a theft has been reported. They are looking for a predictive solution that can flag potential shoplifting through behavioural analysis of body language and of clothing choices made to conceal identity.

Identifying shoplifter behaviour

The client wanted to test whether behavioural analysis of shoplifters (body language, movement patterns in the store, clothing, and actions to conceal identity) can flag potential shoplifting.

Identified three out of ten shoplifting incidents pre-emptively

A high number of false alarms leading to operational difficulties

Scope for expanding the application to shelf space monitoring

The study was able to identify a low percentage of shoplifting cases. The high cost involved outweighed the benefits of the solution.

Long training time. High video resolution

The wholesale retail brand was losing thousands to theft every month. In most cases, the act was spotted on CCTV footage only after a shoplifting incident had been reported. Manually monitoring for such incidents as they happen is not practical, given the multiple camera feeds and the general human tendency to lower one’s guard against isolated incidents.

They were looking for a solution that could help them overcome these issues of human error. The client turned to computer vision based monitoring of feeds from multiple CCTV cameras across the floors, with an algorithm to flag suspicious activity based on body language, clothing chosen to hide identity, or the general demeanour of shoplifters. Their objective was to flag a sufficient number of shoplifting incidents before they happen.

Visible benefits with high risk of misuse

The Deep CNN based behavioural monitoring and flagging of potential shoplifting incidents provided promising results in the pilot.

With limited data and training, the solution was able to identify three out of every ten potential shoplifting incidents correctly.

There were certain downsides to the approach:

a. A significant number of false alarms

b. High cost of running the application 24x7

c. Concerns around security as the CCTV feeds also captured the biometric features of shoppers

The client realised both the benefits and the risks associated with deploying the solution at full scale.

Video Analytics: Are the possibilities less than what is being promised?

Video Analytics, or more precisely Video Content Analytics, hasn’t quite matched the hype. Besides the well-founded concerns about privacy and security, there are functional and non-functional challenges to a successful video analytics deployment.

The promise of machine learning and artificial intelligence driven video analytics can be realised only if all issues and concerns are addressed, and knowing them is the first step towards finding solutions. We discuss some of them in this post.

a. False Alarms

Any machine learning or artificial intelligence based algorithm has an inherent error – commonly classified as Type I and Type II errors.

Type I Error: An event of interest has not occurred but the algorithm thinks it has

Type II Error: An event of interest has occurred but the algorithm fails to detect it

These errors create a nuisance and general frustration for the agency where the solution is being tested or deployed. Over the longer term, a feeling may set in that the solution is half baked, with no clear understanding of when to react to an alarm and when to let it go.
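The two error types can be measured on a labelled test set by comparing the algorithm’s alarms with what actually happened. The sketch below is a minimal illustration of that calculation; the function name and the sample labels are hypothetical.

```python
def error_rates(predicted, actual):
    """Return (type_i_rate, type_ii_rate) for paired boolean alarm/event lists.

    Type I rate  = false alarms / cases where no event occurred.
    Type II rate = missed events / cases where an event occurred.
    """
    false_alarms = sum(p and not a for p, a in zip(predicted, actual))
    misses = sum(a and not p for p, a in zip(predicted, actual))
    negatives = sum(not a for a in actual)
    positives = sum(bool(a) for a in actual)
    type_i = false_alarms / negatives if negatives else 0.0
    type_ii = misses / positives if positives else 0.0
    return type_i, type_ii


# Hypothetical labels for 8 video clips: did an event occur, and did the algorithm raise an alarm?
actual = [True, True, False, False, False, True, False, False]
predicted = [True, False, True, False, False, False, False, False]
type_i, type_ii = error_rates(predicted, actual)
print(type_i, type_ii)
```

Tracking both rates separately matters because lowering the alarm threshold typically reduces missed events at the price of more false alarms, and the acceptable balance depends on the use case.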

b. Impact of ambient factors

The performance of a video analytics solution is highly dependent on feed quality. Under extreme conditions (heavy rain, sudden dark clouds, a large gathering of people, a traffic jam and so on) the performance of the solution may falter. And because these are among the very cases for which a video analytics solution is sought, trust may evaporate completely.

c. Unsuitable use cases

This problem arises from the hype and mis-selling associated with machine learning and artificial intelligence based video analytics solutions. While it is not a problem with the solution itself, unachievable promises made by the selling party may lead to disillusionment and negative publicity for the solutions.

d. Machine vs. Human Interpretation

Video Analytics solutions are based on machine learning and artificial intelligence algorithms. Machines process and understand a feed differently than humans do. In fact, unless explicitly programmed to do so, these algorithms cannot interact with users (or read their emotional responses) to find out whether their response to a query was useful.

e. Cost

An effective and intelligent video analytics solution carries a huge upfront deployment cost and a high maintenance cost. The infrastructure required to process all the feeds generated by the cameras is not cheap, despite all the progress made on cheaper bandwidth and databases.

f. Privacy concerns

More than 80% of the world’s population doesn’t live in China. Citizens have a say on their privacy rights and can punish governments if they feel the State is snooping on them at will. As a responsible company, we should point out that these concerns are not unjustified. There is a high risk of the solution being misused, and our hypothesis is that if something can be misused, it will be. It is therefore important for the agency to look carefully into the possible misuses and create ironclad checks and balances.

Video analytics can be a gamechanger if implemented to bring effective change. We have not found any study establishing that video analytics solutions bring about safer societies (for example, a reduction in crime rates after a city surveillance system is implemented), but we strongly believe in the potential of video analytics.

To be successful, a deployment needs well-defined requirements, and the concerns raised above should be given careful consideration alongside any project-specific challenges.