Chinese (1-6 of 6 results)
by ooooooo

100 images

Pre-collected OCR datasets include images of natural scenes, handwritten texts, bills and documents, and test papers. The AI training data spans 20 languages, various natural environments, and diverse photographic angles.

Annotated Imagery Data

FileMarket provides an Annotated Imagery Data set designed for a range of computer vision and machine learning tasks. This dataset is part of our wider offering, which also includes Textual Data, Object Detection Data, Large Language Model (LLM) Data, and Deep Learning (DL) Data. Each category is built to provide high-quality, comprehensive datasets for AI development.

Specifications:
- Data Size: 50,000 images
- Collection Environment: The images cover a wide array of real-world scenarios, including shop signs, stop boards, posters, tickets, road signs, comics, cover pictures, prompts/reminders, warnings, packaging instructions, menus, building signs, and more.
- Diversity: The dataset spans 5 languages and includes images from various natural scenes captured at multiple photographic angles (looking up, looking down, eye-level).
- Devices Used: Images are captured using cellphones and cameras, reflecting real-world usage.
- Image Parameters: All images are provided in .jpg format, and the corresponding annotation files are in .json format.
- Annotation Details: The dataset includes line-level quadrilateral bounding box annotations and text transcriptions.
- Accuracy: Each vertex of a quadrilateral bounding box is within 5 pixels of its true position, and bounding box accuracy is at least 97%. Text transcription accuracy is also at least 97%.

Unique Data Collection Method: FileMarket uses a community-driven approach to collect data, drawing on our network of over 700k users across various Telegram apps. This keeps our datasets diverse, applicable to real-world use, and ethically sourced, with full participant consent. Combined with the specialized categories we offer, it lets us provide comprehensive data that reflects real-world scenarios for your AI and machine learning projects.
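The annotation and accuracy details above can be illustrated with a minimal sketch. It assumes a hypothetical JSON layout (a `lines` list whose entries carry `points` and `text`), since the listing does not give the actual schema, and checks whether every vertex of an annotated quadrilateral lies within the stated 5-pixel tolerance of a made-up reference quadrilateral.

```python
import json
import math

# Hypothetical annotation layout; the real .json schema is not given in the listing.
SAMPLE_ANNOTATION = """
{
  "image": "sample_0001.jpg",
  "lines": [
    {"points": [[10, 12], [210, 14], [209, 48], [11, 46]], "text": "OPEN 24 HOURS"}
  ]
}
"""

VERTEX_TOLERANCE_PX = 5.0  # per-vertex tolerance stated in the listing


def max_vertex_error(annotated, reference):
    """Return the largest Euclidean distance between corresponding vertices."""
    if len(annotated) != 4 or len(reference) != 4:
        raise ValueError("expected quadrilaterals with exactly four vertices")
    return max(math.dist(a, r) for a, r in zip(annotated, reference))


def within_tolerance(annotated, reference, tolerance=VERTEX_TOLERANCE_PX):
    """True when every annotated vertex is within `tolerance` pixels of the reference."""
    return max_vertex_error(annotated, reference) <= tolerance


annotation = json.loads(SAMPLE_ANNOTATION)
line = annotation["lines"][0]
# Made-up reference quadrilateral, for illustration only.
reference_quad = [[12, 10], [208, 15], [210, 50], [10, 45]]
print(line["text"], within_tolerance(line["points"], reference_quad))
```

Running the same check over a sample of lines and counting the share that pass would be one way to spot-check an accuracy figure, though how the listing's 97% is actually measured is not stated.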

1234.6T videos

Factori Identity Graph data helps brands enhance their first-party data in a privacy-compliant manner, enabling user reach across new platforms and channels. This dataset facilitates matching customer IDs with identities across multiple platforms and devices through deterministic or probabilistic matching methods.

Key features:
- Over 500 million device data records linked to hashed email data
- Comprehensive identity resolution across platforms and devices
- Multiple data points for accurate user matching
- Privacy-compliant data linking methodology
- Historical data covering the past 6 months
- Monthly updates ensuring data freshness

Perfect for identity resolution use cases to create unified client profiles (B2B/B2C) and data enrichment applications that leverage first-party data to build holistic audience segments for improved campaign targeting. Data is delivered through privacy-compliant data clean rooms, enriching your data based on specific requirements.
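Deterministic matching on hashed emails, as described above, can be sketched roughly as follows. The normalization (trim and lowercase), the use of SHA-256, and all record field names are assumptions for illustration; the listing does not say how Factori hashes emails or structures its records.

```python
import hashlib


def hash_email(email):
    """Normalize an email and return its SHA-256 hex digest (assumed scheme)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def match_devices(first_party, device_records):
    """Join first-party customers to device IDs on an exact hashed-email match."""
    devices_by_hash = {}
    for record in device_records:
        devices_by_hash.setdefault(record["hashed_email"], []).append(record["device_id"])
    return {
        customer["customer_id"]: devices_by_hash.get(hash_email(customer["email"]), [])
        for customer in first_party
    }


# Hypothetical sample records, for illustration only.
first_party = [
    {"customer_id": "c1", "email": " Alice@Example.com "},
    {"customer_id": "c2", "email": "bob@example.com"},
]
device_records = [
    {"hashed_email": hash_email("alice@example.com"), "device_id": "device-A"},
    {"hashed_email": hash_email("alice@example.com"), "device_id": "device-B"},
]
matches = match_devices(first_party, device_records)
print(matches)
```

Probabilistic matching, the other method the listing mentions, would instead score candidate pairs on several weaker signals and is not shown here.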

Overview: FileMarket's dataset provides 20,000 high-resolution images of palms, captured in a controlled environment to ensure consistent lighting and clarity. The dataset features a variety of palm types, from different angles and lighting conditions, making it an ideal resource for training AI models in areas such as object detection, plant recognition, and environmental applications.

What Makes This Data Unique? This dataset is distinctive for its comprehensive and diverse representation of palms. The images were carefully captured by professional photographers in a studio setting, ensuring uniformity in quality and lighting. The wide range of palm types, along with various angles and poses, allows for nuanced model training, including distinguishing between species, leaf shapes, and growth patterns. The consistency of the imagery reduces the need for preprocessing, enabling quicker integration into machine learning and deep learning workflows.

Data Sourcing: The palm images were sourced through professional shoots in a studio environment, guaranteeing consistency across the dataset. Each image is shot with optimal lighting and framing to enhance visual clarity. The photographers have experience in nature and botanical photography, ensuring that each photo is of high quality and suited for scientific and technical applications.

Primary Use-Cases: This dataset can be leveraged in a wide array of AI and machine learning contexts, including:
- Object Detection Data: The high clarity and consistent imagery make it well suited for training models that focus on detecting palm trees, their leaves, and different types of foliage.
- Machine Learning (ML) Data: The diversity of palm species and the variety of captured angles provide a robust dataset for training models aimed at plant identification, classification, and recognition.
- Deep Learning (DL) Data: The multi-angle shots of palms are suited to deep learning applications that require complex features, such as image segmentation, object tracking, and even 3D reconstruction of plant structures.
- Environmental AI Applications: With detailed imagery, this dataset is suited for models used in environmental analysis, where palm trees play a role in ecosystem recognition or climate change studies.

Broader Data Offering: This dataset is a valuable addition to FileMarket’s extensive data offerings. It can be easily integrated with other datasets, such as those related to geography, climate, or biodiversity, creating more holistic AI models. Whether you are developing applications for botany research, environmental monitoring, or advanced plant recognition, this dataset is a foundational asset for AI training.

20K images

by check paragraph. check paragraph.

14 images
