Amazon Bets the Farm on Anthropic's AI 🐔

Good morning. Here’s what’s leading each section today:

  • Must-reads: Amazon will invest up to $4B in Anthropic

  • Startups & Fund Raising: qbiq secures $10 million in Seed for its GenAI real estate planning platform

  • Research: LLM-Grounder uses LLMs for zero-shot, open-vocabulary 3D visual grounding

  • Quick Bytes: Computer-vision-guided robot mower wins the 2023 Hackaday Prize

Must-reads 📰

  1. Amazon will invest up to $4B in Anthropic, making Anthropic’s safe and steerable AI widely accessible to AWS customers. (Anthropic) 3-minute read

  2. ChatGPT can now see, hear, and speak, allowing users to have a more intuitive experience. (OpenAI) 5-minute read

  3. Gucci will offer three virtual experiences, playing out across three global metaverse platforms simultaneously. (Vogue Business) 7-minute read

  4. Web3 Could Change the Business Model of Creative Work by offering new tools for creators to earn and own assets, build wealth, and wrest back control from powerful platforms and intermediaries. (HBR) 8-minute read

  5. Tokenizing real-world assets on blockchains is growing among crypto natives and skeptics alike. (CNBC) 3-minute read

Startups & Fund Raises 🦄 

  1. qbiq secures a $10 million seed round for its GenAI real estate planning platform from strategic investors including JLL Spark Global Ventures, 10D, Ocean Azul, Randomforest, and M-FUND. (calcalistech) 2-minute read

  2. Bastion raises $25 million in a funding round led by a16z to provide custody and other services on one platform. (Bloomberg) 2-minute read

  3. Upland to sell NFTs to raise money for equitable playground access. (Blockworks) 3-minute read

Research 🔬

3D visual grounding is a crucial skill for household robots, enabling them to understand and interact with their environment. Existing methods often rely on labeled data and struggle with complex language queries. In this research, the authors introduce LLM-Grounder, a novel approach that leverages Large Language Models (LLMs) to perform zero-shot, open-vocabulary 3D visual grounding. This means it can handle a wide range of natural language queries without the need for labeled training data. The significance lies in its potential to enhance robots' ability to comprehend and respond to human instructions in real-world scenarios.

How it works: LLM-Grounder employs a Large Language Model (LLM) to break a complex natural language query into simpler semantic parts. It then uses a visual grounding tool, such as OpenScene or LERF, to propose candidate objects in the 3D scene for each part. Finally, the LLM reasons about the spatial and commonsense relationships among the candidates to select the final grounding. The entire pipeline runs without any labeled training data, allowing LLM-Grounder to adapt to new 3D scenes and handle diverse text queries effectively.
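If you want to see the shape of that loop in code, here is a minimal sketch. It uses the OpenAI Python client as a stand-in for the paper's LLM planner; the helper names (ask_llm, ground_phrase, llm_grounder) and the dummy bounding-box data are our illustrative placeholders, not the paper's actual API, and the visual grounding step is stubbed where OpenScene or LERF would plug in.

```python
# Minimal sketch of the three-step agent loop described above.
# Assumptions: the OpenAI Python client (OPENAI_API_KEY set) stands in
# for the paper's LLM planner; ground_phrase() is a stub where a real
# system would call OpenScene or LERF. Helper names are illustrative.
from openai import OpenAI

client = OpenAI()

def ask_llm(prompt: str) -> str:
    """One planning/reasoning call to the LLM agent."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model works
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content or ""

def ground_phrase(phrase: str) -> list[dict]:
    """Stub for the visual grounding tool (OpenScene/LERF in the paper).
    Returns dummy scored 3D boxes so the sketch runs end to end."""
    return [{"id": f"{phrase}-0", "center": [0.0, 0.0, 0.0], "score": 0.9}]

def llm_grounder(query: str) -> str:
    # Step 1: the LLM decomposes the complex query into simple noun phrases.
    phrases = [
        p.strip()
        for p in ask_llm(
            f"List the objects mentioned in this query, one per line:\n{query}"
        ).splitlines()
        if p.strip()
    ]

    # Step 2: the grounding tool proposes candidate boxes for each phrase.
    candidates = {p: ground_phrase(p) for p in phrases}

    # Step 3: the LLM reasons over spatial/commonsense relations among the
    # candidates and names the one box that satisfies the full query.
    return ask_llm(
        f"Query: {query}\nCandidates: {candidates}\n"
        "Reply with the id of the candidate that best satisfies the query."
    )

print(llm_grounder("the chair closest to the window in the living room"))
```

Note the design choice the paper's pipeline implies: the LLM never sees raw 3D data. It only plans (step 1) and arbitrates (step 3) over compact candidate descriptions produced by the grounding tool, which is what lets the approach stay zero-shot.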

Between the lines: The paper addresses a significant challenge in robotics, where household robots need to understand and act upon human language instructions within their environment. Traditional approaches often require extensive labeled data, which can be impractical to obtain for every possible scenario. LLM-Grounder, on the other hand, leverages the power of large language models like GPT-3 to parse and interpret language, combined with visual grounding techniques, making it versatile and adaptable. The research suggests that LLMs can significantly enhance the grounding capability, especially when dealing with intricate language queries.

The bottom line: LLM-Grounder introduces a promising approach to 3D visual grounding for household robots. By combining the strengths of Large Language Models with visual grounding tools, it offers the ability to understand complex language queries and identify objects in a 3D scene without the need for pre-existing labeled data. This research has the potential to advance the field of robotics by improving robots' ability to interpret and respond to natural language instructions, making them more practical and versatile tools for various tasks in a home environment.

Quick Bytes ⚡️

TOGETHER WITH METALYST

Hit the inbox of readers from Apple, Meta, Unity, and more

Advertise with MetaLyst to get your brand or startup in front of the Who's Who of metaverse tech. Our readers are technologists who build things and invest in new ideas; your product could be their next favorite thing. Get in touch today.

Comments, questions, tips?

Send a letter to the editor: email or tweet us.

Was this newsletter forwarded to you? Would you like to see more?

Join the conversation
