
by Success.ai

Provides access to over 1.5 billion unique mobile users across the APAC, EU, North America, and MENA regions. This dataset combines raw data signals from 900+ global sources, which are validated, modeled, and segmented into thousands of mobile audience segments.
Key features:
- Comprehensive audience categorization, including interest, demographic, behavioral, and geographic data
- Brand Shoppers segments based on real-world visits to brand outlets
- Place Category Visitors segments reflecting high-intent location visits
- Demographic data including gender, age, marital status, and education (US-specific enhanced data)
- Geo-Behavioral segments based on frequency of visits to specific locations
- Interest and Intent data derived from online browsing and shopping behavior
- Event-related audience segments for sports, culture, and gaming
- App usage categorization based on installed mobile applications
- Specialized US segments including auto ownership, financial behavior, and B2B audiences
Well suited to advertising and marketing campaign ideation, personalized messaging, market intelligence, credit scoring, and retail analytics. Data is collected dynamically and provided monthly, with options for pre-made audience segments or custom segments tailored to specific requirements.
15B
texts
by Success.ai

Factori cross-device data identifies customers across multiple internet-connected devices through encrypted MAIDs and IP addresses. We build comprehensive cross-device data by identifying household users across devices and analyzing their data characteristics to determine intent and interests, enabling advertisers to target viewers on Connected TV platforms effectively.
Key features:
- 3 billion records of cross-device connections
- Daily data collection with same-day delivery
- One full year of historical data accessible
- Household-level device mapping
- Anonymous user identification across multiple devices
- IP address mapping for household determination
- Connected TV viewing patterns and preferences
- Cross-platform identity resolution
- Comprehensive device usage patterns
Supports a wide range of business applications, such as consumer analytics, cross-device attribution, sales forecasting, and coordinated advertising campaigns. This dataset allows organizations to determine household-level reach, conduct aggregated analysis of household device identities, and ensure consumers don't receive repetitive messaging across platforms. Data is collected dynamically and delivered daily via server-to-server transfer.
30M
texts
by Success.ai

Factori Web Data contains fresh web browsing data from users across desktop and mobile devices, indicating search intent, purchase intent, and online category interests. This comprehensive dataset tracks user activity across popular websites worldwide and is delivered as a daily feed via server-to-server transfer.
Key features:
- Over 2 billion records of web browsing activity
- Daily data collection with daily delivery frequency
- Six months of historical data accessible
- Anonymous user identification across devices
- IP address data for geographic segmentation
- Search query capture for intent analysis
- Website category classification
- Cross-device browsing behavior patterns
- Interest and intent indicators from browsing activity
Well suited to personalized targeting applications, data enrichment projects, market intelligence analysis, and fraud detection and cybersecurity initiatives. This dataset allows organizations to analyze web behavior patterns and build highly accurate audience segments from web activity, so that ads can be targeted by interest category and by search or browsing intent. Data is collected dynamically and provided through suitable delivery methods at daily, weekly, or monthly intervals.
40T
images
by Success.ai

Pre-collected OCR datasets include images of natural scenes, handwritten texts, bills and documents, and test papers. The AI training data spans 20 languages, various natural environments, and diverse photographic angles.
Annotated Imagery Data
FileMarket provides a robust Annotated Imagery Data set designed to meet the diverse needs of various computer vision and machine learning tasks. This dataset is part of our extensive offerings, which also include Textual Data, Object Detection Data, Large Language Model (LLM) Data, and Deep Learning (DL) Data. Each category is meticulously crafted to ensure high-quality and comprehensive datasets that empower AI development.
Specifications:
- Data Size: 50,000 images
- Collection Environment: The images cover a wide array of real-world scenarios, including shop signs, stop boards, posters, tickets, road signs, comics, cover pictures, prompts/reminders, warnings, packaging instructions, menus, building signs, and more.
- Diversity: The dataset spans 5 languages and includes images from various natural scenes captured at multiple photographic angles (looking up, looking down, eye-level).
- Devices Used: Images are captured using cellphones and cameras, reflecting real-world usage.
- Image Parameters: All images are provided in .jpg format, and the corresponding annotation files are in .json format.
- Annotation Details: The dataset includes line-level quadrilateral bounding box annotations and text transcriptions.
- Accuracy: The error margin for each vertex of the quadrilateral bounding box is within 5 pixels, ensuring bounding box accuracy of at least 97%. The text transcription accuracy also meets or exceeds 97%.
Unique Data Collection Method:
FileMarket utilizes a community-driven approach to collect data, leveraging our extensive network of over 700k users across various Telegram apps. This method ensures that our datasets are diverse, real-world applicable, and ethically sourced, with full participant consent.
This approach allows us to provide datasets that are both comprehensive and reflective of real-world scenarios, ensuring that your AI models are trained on the most relevant and diverse data available. By integrating our unique data collection method with the specialized categories we offer, FileMarket is committed to providing high-quality data solutions that support and enhance your AI and machine learning projects.
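The annotation format and the 5-pixel vertex margin described in the specifications can be illustrated with a minimal Python sketch. The JSON layout (`image`, `lines`, `text`, `quad`), the sample values, and the helper names are all hypothetical, since the listing only states that annotations are .json files with line-level quadrilaterals and transcriptions. Measuring the margin as the Euclidean distance between corresponding vertices is likewise an assumption.

```python
import json
import math

# Hypothetical annotation layout: the listing does not publish a schema,
# so every field name here is an assumption made for illustration.
SAMPLE = """
{
  "image": "example.jpg",
  "lines": [
    {"text": "OPEN 24 HOURS",
     "quad": [[10, 20], [210, 22], [209, 60], [9, 58]]}
  ]
}
"""


def vertex_errors(quad, reference):
    """Return the pixel distance between each pair of corresponding vertices."""
    if len(quad) != 4 or len(reference) != 4:
        raise ValueError("a quadrilateral needs exactly four vertices")
    return [math.dist(p, q) for p, q in zip(quad, reference)]


def within_margin(quad, reference, margin_px=5.0):
    """True if every vertex lies within margin_px of its reference vertex."""
    return all(err <= margin_px for err in vertex_errors(quad, reference))


annotation = json.loads(SAMPLE)
line = annotation["lines"][0]

# A made-up second labeling of the same text line, used as the reference.
reference = [[12, 21], [208, 20], [210, 63], [10, 60]]

print(line["text"], within_margin(line["quad"], reference))
```

A check like this could be run over a sample of double-annotated images to estimate how often boxes fall inside the stated margin. It assumes both labelings list the vertices in the same order.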
500K
audios
100K
images