Sci Simple


# Computer Science # Computer Vision and Pattern Recognition # Artificial Intelligence

Unlocking Ancient Secrets: Oracle Bones & AI

Discover how AI is transforming the study of ancient Chinese oracle bones.

Zijian Chen, Tingzhu Chen, Wenjun Zhang, Guangtao Zhai



AI Meets Ancient Oracle Bones: Revolutionizing the study of ancient inscriptions with AI technology.

Oracle bones are ancient artifacts used by the Shang dynasty in China for divination and rituals from around 1400 B.C. to 1100 B.C. These bones bear inscriptions that provide valuable insights into the thoughts, language, and culture of past societies. However, the task of interpreting these inscriptions is complex and often requires expert knowledge.

This is where OBI-Bench comes in. It's a newly created benchmark designed to assess how well large multi-modal models (LMMs) handle tasks related to oracle bone inscriptions (OBI). The goal is to see whether these advanced models can process and understand ancient scripts, helping scholars unlock the secrets hidden in these artifacts.

What is OBI-Bench?

OBI-Bench is a collection of 5,523 images of oracle bone inscriptions gathered from a range of sources, including original bones, inked rubbings, bone fragments, cropped single characters, and handprinted characters. The images are organized around five key tasks that are essential for understanding oracle bone scripts:

  1. Recognition: Finding specific characters in the images.
  2. Rejoining: Putting broken pieces of text back together.
  3. Classification: Sorting characters into their correct categories based on meaning.
  4. Retrieval: Searching for relevant images based on a query.
  5. Deciphering: Figuring out what the characters mean in a historical context.

Unlike other benchmarks, OBI-Bench is tailored specifically to the challenges of oracle bone inscriptions, asking LMMs to perform tasks similar to those that human experts face.

The Importance of Oracle Bones

Oracle bones are like time capsules that reveal the beliefs and practices of the Shang dynasty. These inscriptions aren't just scribbles; they hold the keys to understanding ancient Chinese civilization. As exciting as it sounds, interpreting these inscriptions comes with its own set of challenges.

Over centuries, many bones have deteriorated. They've become fragmented and some are damaged, making it tricky to recognize or interpret the characters. Additionally, the wide range of styles used in these inscriptions can confuse even the most experienced scholars.

The Challenges

When trying to work with oracle bone inscriptions, researchers face several hurdles:

  1. Erosion and Damage: After being buried for thousands of years, many oracle bones have become eroded and fragmented. This makes it hard to identify characters.
  2. Rejoining Fragments: Piecing together broken pieces of text is essential but can be time-consuming and requires specialized knowledge.
  3. Stylistic Variation: The different styles of writing can make it hard to recognize and classify characters.
  4. Retrieval Difficulties: Creating large databases of these inscriptions is complicated due to the need to distinguish between similar characters.
  5. Translation Issues: Many oracle bones have characters that don't map directly to modern Chinese, making interpretation tricky.

Researchers have used traditional methods to tackle these issues. However, with the emergence of LMMs that have strong visual and reasoning capabilities, there's potential to significantly improve the process.

Enter LMMs

Large multi-modal models combine visual perception and language understanding, making them ideal for tackling complex tasks like those seen in OBI research. The main question is: Can these models help improve the study of oracle bone inscriptions?

To answer this, researchers evaluated 23 popular LMMs (6 proprietary and 17 open-source) across the five tasks. The results showed that while LMMs have impressive capabilities, they still fall short in fine-grained perception and interpretation of these ancient scripts.

The Five Key Tasks in OBI-Bench

Recognition

This task involves locating densely packed oracle bone characters in different kinds of images, such as original bones or inked rubbings. Models are evaluated on how accurately they identify the characters in each image.
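
The article does not say how localization accuracy is scored. One common way to compare a predicted character box with an annotated one is intersection over union (IoU), sketched below. The `Box` layout and the example coordinates are assumptions for illustration, not details taken from OBI-Bench.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of a box; boxes with inverted corners count as zero."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, a value in [0, 1]."""
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    intersection = area(overlap)
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


# Made-up example: a predicted box that partly overlaps the annotated one.
predicted = Box(10, 10, 50, 50)
annotated = Box(30, 30, 70, 70)
print(round(iou(predicted, annotated), 3))  # 0.143
```

A prediction is usually counted as correct when its IoU with an annotated box exceeds a chosen threshold; that threshold is a design choice, not something the article specifies.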

Rejoining

Rejoining is like putting together a puzzle of broken text fragments. This task assesses how well models can stitch together these fractured pieces to form coherent text.

Classification

Each character from the oracle inscriptions needs to be sorted into the category that matches its meaning. This task checks how reliably the models categorize characters.

Retrieval

When given a query, how well can the model find the right images in a database? This task measures the model's effectiveness at retrieving relevant results.
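
As a rough illustration of what retrieval involves, the sketch below ranks database images by cosine similarity to a query and then computes recall at k. It assumes each image has already been turned into a feature vector by some model; the vectors, image IDs, and the choice of recall at k are hypothetical, since the article does not describe the benchmark's actual retrieval metric.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_images(query: Sequence[float],
                database: Dict[str, Sequence[float]]) -> List[str]:
    """Return image IDs ordered from most to least similar to the query."""
    return sorted(
        database,
        key=lambda image_id: cosine_similarity(query, database[image_id]),
        reverse=True,
    )


def recall_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant images that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for image_id in ranking[:k] if image_id in relevant)
    return hits / len(relevant)


# Made-up feature vectors standing in for model embeddings.
database = {
    "rubbing_a": [0.9, 0.1, 0.0],
    "rubbing_b": [0.1, 0.9, 0.0],
    "rubbing_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_images(query, database)
print(ranking)  # ['rubbing_a', 'rubbing_c', 'rubbing_b']
print(recall_at_k(ranking, {"rubbing_a", "rubbing_c"}, k=2))  # 1.0
```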

Deciphering

The ultimate goal of understanding oracle bones is to interpret their meanings. This task evaluates how well models can provide insights into the historical and cultural significance of the inscriptions.

Evaluation of LMMs

The evaluation found that even the latest versions of GPT-4o, Gemini 1.5 Pro, and Qwen-VL-Max remain far below the level of ordinary, non-expert humans on some fine-grained perception tasks. On the deciphering task, however, they performed at a level comparable to untrained humans, which points to potential for future development in this area.

Key Findings

  1. Much Room for Improvement: LMMs still have significant work to do in tasks requiring precise recognition and rejoining of fragments.
  2. Limited Sensitivity to Local Information: Many models failed to detect the subtle features needed for recognition and rejoining tasks.
  3. Strong Classification and Retrieval Capabilities: LMMs showed promising results in classifying characters and retrieving relevant images, particularly for clearer datasets.
  4. Remarkable Deciphering Skills: Some models performed surprisingly well in deciphering tasks, suggesting they can offer new interpretations of undeciphered characters.

The Process: From Data Collection to Evaluation

To create OBI-Bench, researchers collected images from multiple sources, ensuring diversity in the data. They involved domain experts to annotate the images and refine the datasets. The evaluation used different types of queries, such as "What is in this image?" or "How many characters can you see?", to assess each model's understanding of the tasks.
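
A query-based evaluation of this kind can be pictured as a loop that sends each image and question to a model and compares the reply with an expert annotation. The sketch below is only an illustration under stated assumptions: `Sample`, `stub_model`, the image IDs, and exact-match scoring are all hypothetical stand-ins, and the article does not describe the benchmark's real scoring procedure.

```python
from typing import Callable, List, NamedTuple


class Sample(NamedTuple):
    """One evaluation item (hypothetical structure)."""
    image_id: str
    prompt: str
    reference: str  # expert-annotated answer


def normalize(answer: str) -> str:
    """Lowercase, trim, drop trailing periods, and collapse whitespace."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def exact_match_accuracy(model: Callable[[str, str], str],
                         samples: List[Sample]) -> float:
    """Share of samples where the model's answer equals the reference."""
    if not samples:
        return 0.0
    correct = sum(
        normalize(model(s.image_id, s.prompt)) == normalize(s.reference)
        for s in samples
    )
    return correct / len(samples)


def stub_model(image_id: str, prompt: str) -> str:
    """Stand-in for a real LMM call; returns canned answers only."""
    canned = {"bone_001": "3", "bone_002": "5."}
    return canned.get(image_id, "unknown")


samples = [
    Sample("bone_001", "How many characters can you see?", "3"),
    Sample("bone_002", "How many characters can you see?", "4"),
]
print(exact_match_accuracy(stub_model, samples))  # 0.5
```

Exact matching suits short answers such as counts. Open-ended questions like "What is in this image?" would need a softer comparison, which this sketch does not attempt.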

Developing Datasets

Two specific datasets were created: the Original Oracle Bone Recognition (O2BR) dataset and the OBI-rejoin dataset. Both serve as important resources for training and testing LMMs in the context of oracle bone inscriptions.

The Future of OBI Research

The findings from OBI-Bench suggest that LMMs can be valuable tools in the study of oracle bones. They present exciting possibilities for streamlining the research process, reducing the heavy manual workload typically associated with the deciphering of these ancient scripts.

Potential Directions

  1. Improved Preprocessing Techniques: By developing methods to enhance image quality, researchers may boost LMM performance.
  2. Fine-Tuning for Specific Datasets: Tailoring models to learn from the unique characteristics of oracle bones can enhance their interpretive abilities.
  3. Interactive Systems: Creating systems where users can ask questions about oracle bones in natural language will make the research process more accessible.
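
To make the first direction above concrete, here is a minimal sketch of one simple enhancement step, min-max contrast stretching, applied to a grayscale image stored as nested lists. The technique and the sample pixel values are assumptions chosen for illustration; the article does not say which preprocessing methods the researchers have in mind.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel values in the range 0..255


def stretch_contrast(image: Image) -> Image:
    """Linearly rescale pixels so the darkest becomes 0 and the brightest 255."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return []
    low, high = min(pixels), max(pixels)
    if low == high:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in image]
    # Integer arithmetic keeps the result deterministic.
    return [[(p - low) * 255 // (high - low) for p in row] for row in image]


# Made-up low-contrast patch, as a faded rubbing might produce.
faded = [[100, 110], [120, 140]]
print(stretch_contrast(faded))  # [[0, 63], [127, 255]]
```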

Conclusion

The exploration of oracle bone inscriptions through LMMs holds great promise for advancing our understanding of ancient civilizations. While there are still hurdles to overcome, the use of modern technology in this field could lead to exciting discoveries and greater insights into the rich tapestry of human history.

So, the next time you think of ancient scripts, remember that with a sprinkle of technology and a dash of innovation, the secrets of oracle bones may soon be within our grasp, just waiting to be deciphered!

Original Source

Title: OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?

Abstract: We introduce OBI-Bench, a holistic benchmark crafted to systematically evaluate large multi-modal models (LMMs) on whole-process oracle bone inscriptions (OBI) processing tasks demanding expert-level domain knowledge and deliberate cognition. OBI-Bench includes 5,523 meticulously collected diverse-sourced images, covering five key domain problems: recognition, rejoining, classification, retrieval, and deciphering. These images span centuries of archaeological findings and years of research by front-line scholars, comprising multi-stage font appearances from excavation to synthesis, such as original oracle bone, inked rubbings, oracle bone fragments, cropped single character, and handprinted character. Unlike existing benchmarks, OBI-Bench focuses on advanced visual perception and reasoning with OBI-specific knowledge, challenging LMMs to perform tasks akin to those faced by experts. The evaluation of 6 proprietary LMMs as well as 17 open-source LMMs highlights the substantial challenges and demands posed by OBI-Bench. Even the latest versions of GPT-4o, Gemini 1.5 Pro, and Qwen-VL-Max are still far from public-level humans in some fine-grained perception tasks. However, they perform at a level comparable to untrained humans in deciphering task, indicating remarkable capabilities in offering new interpretative perspectives and generating creative guesses. We hope OBI-Bench can facilitate the community to develop domain-specific multi-modal foundation models towards ancient language research and delve deeper to discover and enhance these untapped potentials of LMMs.

Authors: Zijian Chen, Tingzhu Chen, Wenjun Zhang, Guangtao Zhai

Last Update: 2024-12-02 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2412.01175

Source PDF: https://arxiv.org/pdf/2412.01175

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
