NTCIR-18

The 18th NII Testbeds and Community for Information access Research

Organizer

National Institute of Informatics (NII), Japan

Conference Date

June 10-13, 2025

Location

Tokyo, Japan

Participating Team

IMNTPU Team, National Taipei University

2 Tasks Participated

7/8 First Place Awards

4 Languages Supported

AI Research Approach

Conference Introduction

About NTCIR-18

NTCIR-18 is a biennial evaluation workshop series organized by the National Institute of Informatics (NII) to advance research in information access technologies such as information retrieval, question answering, and text summarization. The conference brings together international researchers to collaborate on cutting-edge AI and NLP challenges.

Research Focus

Information access technologies, multilingual AI, and specialized domain applications

Global Impact

Advancing international collaboration in AI research and evaluation methodologies

Our Achievement

Outstanding performance in medical AI and financial reasoning tasks

Innovation Highlight

Novel approaches combining multiple AI models and prompt engineering

Tasks Participated

The IMNTPU team participated in two main tasks during NTCIR-18, achieving outstanding results:

MedNLP-CHAT

Medical Natural Language Processing Dialogue System Evaluation - This task focuses on analyzing whether medical chatbot responses to patient questions contain potential medical, legal, or ethical risks. The dataset includes Japanese and German corpora with corresponding English and French translations. Our team achieved exceptional performance, winning first place in 7 out of 8 subtasks in multilingual risk assessment Joint Accuracy metrics.

FinArg-2

Financial Argument Temporal Reasoning - This task focuses on temporal reasoning of statements in earnings calls and social media, requiring models to identify temporal reference points in statements and predict the validity period of statement content. Our team combined fine-tuning models and prompt engineering approaches, achieving first place in key subtasks and demonstrating strong potential in financial semantic reasoning applications.

Research Methodology

Our research employed innovative methodologies for both tasks, combining state-of-the-art approaches in natural language processing and machine learning.

🏥

MedNLP-CHAT Methodology

🤖

Agentic AI Framework

Multi-model collaborative approach with majority voting and weighted scoring systems

💬

Prompt Engineering

Three-shot prompting with GPT-4o, Claude 3.5 Sonnet, Mistral small latest

⚖️

Risk Assessment

Medical, legal, and ethical risk evaluation across multiple languages

💰

FinArg-2 Methodology

🔧

Model Fine-tuning

BERT, RoBERTa, DistilBERT for encoder-based tasks; GPT-4o Mini for decoder-based

📊

Data Augmentation

Semantic variation generation to address class imbalance in financial datasets

⏰

Temporal Reasoning

Classification of argument claims based on temporal references in financial contexts

Technical Approach

Natural Language Processing Machine Learning Deep Learning Prompt Engineering Multilingual AI Medical Risk Assessment Financial Reasoning Temporal Analysis Ensemble Methods Large Language Models

Research Results

Outstanding Achievement

The IMNTPU team achieved remarkable results in both tasks. In FinArg-2, we secured first place in key subtasks through our innovative combination of fine-tuning models and prompt engineering. For MedNLP-CHAT, our multi-model collaborative approach effectively identified potential risks, achieving first place in 7 out of 8 subtasks in multilingual risk assessment. These achievements demonstrate our team's excellence in financial semantic reasoning and medical risk assessment applications.