DIY Detectors: One-Click, User-Generated AI Models

Carter Maslan
Camio
4 min read · Apr 23, 2023


DIY Detectors with Segmented Object Tracking Auto-extraction (SOTA) enable one-click creation and deployment of new user-generated AI models.

Teach us to fish

Security systems have historically touted features like motion detection, line crossing, door alarms, people detection, dwell times, and vehicle counting. Customers then find gaps in scenario coverage or quality and demand that those vendors add more features. That continual loop of feature requests is really a symptom of a problem in the way these systems have been developed.

If security systems instead become self-serve, where end users create and use the AI that gives them the ability to direct attention to the risks and objectives applicable to their own environment, then we’ll see breakout productivity gains and cost savings. That transition from receiving point solutions to creating continual improvements empowers end users, as in the age-old saying, “Give someone a fish, and you feed them for a day. Teach someone to fish, and you feed them for a lifetime.”

DIY Detector AI models launch immediately but also improve continually with usage.

That’s why Camio DIY Detectors are so exciting. They enable real-time search, alerts, and data to cover anything observed in video. End users create and deploy their own new custom AI models to detect the objects and activities that meet their particular needs — without code or data scientists. With a single click, end users label the things they want to detect. And with that unbounded vocabulary to describe what’s happening in any scene, we have the building blocks for radical advances — like those seen with GPT-4 — in the security industry.

Watch the 23-minute webinar introducing Camio DIY Detectors or jump to the demo.

Watershed Moments

The PC. The Web. The iPhone. And now, GPT.

Watershed moments create new demands, because people’s expectations are irreversibly raised. There’s a clear before and after. And people who experience the after rarely return to the before.

“Why is it that ChatGPT aces the bar exam, scoring higher than law school students, yet my security system stupidly generates 10,000 false alarms per week?”

The amazing performance of ChatGPT, with its use of Large Language Models (LLMs) derived from seminal research on Transformers (“Attention Is All You Need”), has inspired everyone. Its natural language flexibility and adaptation to context are so powerful that Artificial General Intelligence (AGI) feels more within reach than ever before. At the same time, research into Augmented Reality (AR) is producing powerful perception technologies. The Big Tech AI wars are producing amazing peace dividends.

The fastest way to apply those peace dividends to security video is via Camio real-time video search, because Camio supports natural language query processing and filtering throughout its video processing pipeline, end-to-end. Camio is also based on Cloud Native Kubernetes so that it can run workloads at the edge, near edge, or cloud depending on where compute capacity is available. And the total compute capacity required is reduced by as much as 70% via Camio’s multi-stage query filtering that predicates each stage on preconditions of the prior stages (e.g., Camio knows not to compute expensive pose estimation to detect crouching or crawling under a car for catalytic converter theft if there were no humans near vehicles in the first place).
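The multi-stage filtering idea can be pictured with a minimal Python sketch. This is not Camio's implementation: the `Stage` class, the `process` function, the cost units, and the dictionary standing in for a video frame are all hypothetical, and "humans near vehicles" is simplified to both labels being present. It only shows how an expensive stage can be skipped when an earlier stage did not establish its preconditions.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Stage:
    name: str
    cost: int                          # relative compute cost (made-up units)
    requires: Set[str]                 # labels earlier stages must have found
    run: Callable[[dict], List[str]]   # returns labels found in the frame


def process(frame: dict, stages: List[Stage]) -> Tuple[Set[str], int]:
    """Run stages in order, skipping any whose preconditions are unmet."""
    labels: Set[str] = set()
    spent = 0
    for stage in stages:
        if not stage.requires <= labels:
            continue  # precondition missing: do not pay for this stage
        spent += stage.cost
        labels.update(stage.run(frame))
    return labels, spent


# Hypothetical stages; each "detector" just reads a pre-filled frame dict.
STAGES = [
    Stage("motion", 1, set(),
          lambda f: ["motion"] if f.get("motion") else []),
    Stage("objects", 10, {"motion"},
          lambda f: list(f.get("objects", []))),
    Stage("pose", 100, {"human", "vehicle"},
          lambda f: [f["pose"]] if "pose" in f else []),
]

parked_car = {"motion": True, "objects": ["vehicle"], "pose": "crouching"}
suspect = {"motion": True, "objects": ["human", "vehicle"], "pose": "crouching"}

print(process(parked_car, STAGES)[1])  # 11: pose estimation skipped, no human
print(process(suspect, STAGES)[1])     # 111: all three stages ran
```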

That combination — of search at its core and virtualization in its delivery — means that Camio incorporates AI advances quickly, because there are no complicated settings, dialogs, or architectural constraints that block the expanded understanding of what’s happening in any scene.

Instead of thousands of unmonitored false alarms, GSOCs get user-defined, actionable events. AI-generated video summaries include natural language descriptions unique to each environment. The programmatic actions and context-aware interactions with the system give staff a 50x force multiplier for incident detection and response.

In fact, the security industry’s pursuit of a “single pane of glass” is mostly a response to old systems that were so complicated that the operator training alone demanded a single UI. As AI enables machines to triage and act on our behalf, the best UI is no UI — with natural language as a close second. Rather than users learning the system, the system learns from its users.

Security 2.0

With building blocks like DIY Detectors and LLMs like GPT, it’s realistic to expect that security directors will soon be able to describe their objectives in simple natural language and have their security systems deliver the data, events, and programmatic actions they want to see. Much like GPT translates natural language into SQL queries and code today, natural language references to DIY Detectors like “security guard”, “float”, “diving in the shallow end”, or “beer cooler” become observable in real time for automated policy compliance.

For example, the head of security for a multi-family residential property company will be able to say:

“Alert our on-premise guards whenever there are safety concerns at any of our pools. We’ve seen problems whenever people bring beer coolers or use floats/rafts in the pool. Unaccompanied kids are a big risk. When the pool is crowded, send a lifeguard. At night, make sure the pool lights are on and the gate is closed. Also, nobody should be in the pool area between 11pm and 6am. If there’s no guard on-premise, make sure our GSOC dispatches one at night. But if our uniformed guard shows up to clear the scene, it’s safe for our GSOC to ignore that scene until it’s cleared of people.”
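To make that concrete, here is a rough Python sketch of the kind of rule set a request like this might eventually be translated into. It is purely illustrative: the `Observation` and `Rule` classes, the `evaluate` function, and the crowd threshold of 20 are assumptions, not a Camio API, and the natural language translation step itself is not shown. Only the labels “security guard”, “float”, and “beer cooler” and the 11pm to 6am window come from the text above.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable, List, Set

GUARD_LABEL = "security guard"   # DIY Detector label named in the article
CROWD_THRESHOLD = 20             # hypothetical definition of "crowded"


@dataclass
class Observation:
    labels: Set[str]   # detector labels currently visible in the scene
    at: time           # local time of the observation
    people: int = 0    # number of people counted in the pool area


@dataclass
class Rule:
    name: str
    when: Callable[[Observation], bool]


def is_closed_hours(t: time) -> bool:
    # The 11pm to 6am window wraps past midnight, so test both sides.
    return t >= time(23, 0) or t < time(6, 0)


RULES = [
    Rule("cooler or float", lambda o: bool(o.labels & {"beer cooler", "float"})),
    Rule("after hours", lambda o: o.people > 0 and is_closed_hours(o.at)),
    Rule("crowded", lambda o: o.people >= CROWD_THRESHOLD),
]


def evaluate(obs: Observation) -> List[str]:
    """Return the names of triggered rules for one observation."""
    # Simplification: a uniformed guard on scene suppresses every rule.
    if GUARD_LABEL in obs.labels:
        return []
    return [rule.name for rule in RULES if rule.when(obs)]


print(evaluate(Observation({"float"}, time(14, 0), people=3)))
print(evaluate(Observation({GUARD_LABEL}, time(23, 30), people=2)))
```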

This is a tectonic shift. It will be fast. Buckle up! Learn more at https://camio.com/diy.


Camio co-founder & CEO. Making real-time video smart and useful.◔◔ ☁