CIO Insider

Staqu to Offer Audio and Video Surveillance for Security

CIO Insider Team | Wednesday, 23 February, 2022

According to reports, Atul Rai, co-founder and chief executive officer of Staqu Technologies, is seeking the tender for a Lucknow smart city project that would use audio and video surveillance to improve security.

Rai's company already has a product named Jarvis, used by the Uttar Pradesh Police and other state police forces, which combines closed-circuit television (CCTV) cameras with artificial intelligence (AI)-based facial recognition.

Rai says, “We have used audio analytics to detect incidents such as prison fights in Uttar Pradesh. Our target is to implement it in smart cities.”

Reports suggest that, in its new edition, Jarvis does not just use cameras to watch crimes happen; it also employs microphones to listen to what is going on in the city. The audio analytics tool is also being used by organizations in retail and manufacturing to detect distress sounds and accidents.

“Every camera is capable of sending audio data using a mic. If a crime is being committed out of the field of view of this camera, audio can help in identifying if someone is in distress and needs help,” Rai adds.

The company is also working on a new natural language processing (NLP)-based feature that will allow users to ask Jarvis for information, prompting Jarvis to scan data across all the cameras.

Staqu is one of the few companies in India that offer AI-based audio analytics tools. These systems can identify sounds such as gunshots, a person’s scream or specific words that indicate distress. They use convolutional neural networks (CNNs) to identify sound types. CNNs are typically used for image and video recognition, but here they are used to discern patterns in sounds. In principle, an audio surveillance system could alert the nearest hospital if an accident occurs, or contact the police if a group of people is planning a crime.
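
To illustrate how a network built for images can be pointed at sound, the toy sketch below turns a short signal into a spectrogram (a time-by-frequency grid that can be treated like an image) and slides a small kernel over it, which is the core operation of a CNN layer. This is not Staqu's implementation: the function names, the frame sizes, and the hand-written "onset" kernel are all assumptions made for illustration, and a real CNN would learn its kernels from labelled audio rather than have them written by hand.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Split the signal into overlapping frames and return, for each frame,
    the DFT magnitudes of frequency bins 0 .. frame_size/2 - 1."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation a CNN layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw)
            ))
        out.append(row)
    return out


def tone(freq_bin, n_samples, frame_size=64):
    """Sine wave whose frequency falls exactly on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * n / frame_size)
            for n in range(n_samples)]


# Hypothetical input: silence followed by a sudden tone (a crude "event").
signal = [0.0] * 256 + tone(8, 256)
spec = spectrogram(signal)  # rows = time frames, columns = frequency bins

# Hand-written kernel (an assumption): responds when energy appears between
# two consecutive time frames at the same frequency.
onset_kernel = [[-1.0], [1.0]]
feature_map = conv2d(spec, onset_kernel)

# Locate the strongest response in the feature map.
peak_value, peak_row, peak_col = max(
    (value, i, j)
    for i, row in enumerate(feature_map)
    for j, value in enumerate(row)
)
print(f"strongest onset at time frame {peak_row}, frequency bin {peak_col}")
```

The strongest response lands at the frequency bin of the tone, around the frame where the tone begins. Stacking many learned kernels of this kind is what would let a network tell, say, a scream from a gunshot by the shape each leaves in the spectrogram.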

Rai says Jarvis’ accuracy has been tested against VoxCeleb, one of the largest audio-visual datasets for human speech. He claimed the system is 98.7 percent accurate.


