No-nonsense AI strategy, concepts, and tools for training teams and content developers

— AI NEWS —

Cheap AI “video scraping” can now extract data from any screen recording

Oct 18, 2024

Quick summary

Feeding screen recordings or other video into a video-enabled model such as Google’s Gemini allows you to extract data from the video.

Why it matters

In this experiment, a researcher needed to add up some numeric values scattered across twelve different emails. He made a screen recording of himself scrolling through the emails. He then got Google Gemini to extract the numbers from his screen recording into a CSV file for use in a spreadsheet.
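Once the model has returned its CSV text, the adding-up step is easy to script. The sketch below is a minimal illustration, not the researcher's actual method: it assumes the reply is plain CSV with a header row, and the column names (`date`, `amount`), the sample values, and the helper `sum_column` are all hypothetical. The call to the video model itself is left out.

```python
import csv
import io
import re
from decimal import Decimal


def sum_column(csv_text: str, column: str) -> Decimal:
    """Add up one numeric column of CSV text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = Decimal("0")
    for row in reader:
        # Strip currency symbols and thousands separators before parsing.
        cleaned = re.sub(r"[^0-9.\-]", "", row[column])
        total += Decimal(cleaned)
    return total


# Made-up stand-in for a model reply; real output will differ.
sample_reply = """date,amount
2024-01-03,"$1,200.50"
2024-02-03,$300.00
"""

print(sum_column(sample_reply, "amount"))
```

Using `Decimal` rather than `float` avoids rounding surprises when the values are money. Whatever the model returns, it is worth spot-checking a few rows against the original recording before trusting the total.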

While this is a simple example, the implications of the ability to video-scrape screencasts are significant. It means anything you can display on your screen (websites, apps, e-learning, etc.), and anything that can be captured as video from a phone or camera (books on a bookshelf, panoramic displays), has the potential to become usable input for AI.

Several major AI developers, including OpenAI and Anthropic, have shown research previews of models that accept video as input, but so far only Google has released the feature, in Gemini. This is probably because processing video is so computationally expensive. Those costs should fall over time, so expect video input to become widely available in the near future.

AI News

Jan 30, 2025

Chinese firm DeepSeek shakes up the AI industry

Chinese firm DeepSeek rattles the AI world with claims that it trained its new frontier model at a fraction of the time and cost of comparable models created by US companies.

Jan 16, 2025

OpenAI plans to become a for-profit corporation

OpenAI is planning to shed its non-profit oversight and transition to a publicly traded, for-profit “public benefit corporation” to raise needed capital.

Aug 15, 2024

Access to AI boosts writing creativity and usefulness

A new study reveals that access to AI-generated ideas causes stories to be evaluated as more creative, better written, and more enjoyable—especially for less creative writers.

Jun 9, 2024

Illuminate turns technical documents into podcasts

An experimental project from Google shows how AI can turn complex documents into a podcast that makes the information easier to understand.