Yuchen's blog

YAN Yuchen

Human Language Technology Center
Department of Computer Science and Engineering
The Hong Kong University of Science & Technology
HKUST, Clear Water Bay, Kowloon, Hong Kong
lab +852 2358-8831 · room 2602 (lifts 27-30)
yyanaa@cs.ust.hk · http://www.cs.ust.hk/~yyanaa

I am in the third year of my PhD in Computer Science at HKUST.

Research Interests

Automatic Semantic Role Labeling

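Semantic Role Labeling (SRL) is the task of assigning semantic roles to segments of a sentence. Semantic roles describe who did what to whom, for whom, when, where, why and how. For example, in "Mary sold the book to John yesterday", "sold" is the predicate, "Mary" is who, "the book" is what, "John" is to whom, and "yesterday" is when.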

Here is a demo page that takes a sentence and performs automatic SRL. You will need to log in to use this demo.
Username: user_0001@gmail.com
Password: user_0001

Semantic Role Labeling as Machine Translation Metric

A machine translation metric is a set of judgement rules that predicts the "degree of goodness" of a machine translation. Specifically, it computes the "degree of similarity" between a machine translation output and a human translation. The most widely used metric for machine translation today is BLEU, which is not entirely convincing, since it relies on simple n-gram matching. To judge whether a machine translation is good, what should be measured is how well it preserves the meaning. Inspired by this idea, MEANT, a machine translation metric based on SRL, works like this:

Perform SRL on the hypothesis sentence (machine translation output)
Perform SRL on the reference sentence (human translation)
Count how many semantic frames match between the hypothesis sentence and the reference sentence, and how well the content of each semantic frame matches
Aggregate the scores computed in the previous step to get a final score
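
The steps above can be sketched roughly as follows. This is a simplified illustration and not the actual MEANT implementation: it assumes SRL has already been run and its output is available as a list of frames, and the `Frame` class, the token-overlap similarity, the greedy frame alignment and the F-score aggregation are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """One semantic frame: a predicate and its role fillers (hypothetical structure)."""
    predicate: str
    roles: Dict[str, str] = field(default_factory=dict)  # e.g. {"who": "Mary"}


def filler_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercased tokens (an assumed stand-in for lexical similarity)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def frame_match(hyp: Frame, ref: Frame) -> float:
    """Score in [0, 1] for how well two frames match; 0 if the predicates differ."""
    if hyp.predicate.lower() != ref.predicate.lower():
        return 0.0
    labels = set(hyp.roles) | set(ref.roles)
    if not labels:
        return 1.0
    total = sum(
        filler_similarity(hyp.roles.get(label, ""), ref.roles.get(label, ""))
        for label in labels
    )
    return total / len(labels)


def srl_score(hyp_frames: List[Frame], ref_frames: List[Frame]) -> float:
    """Align frames greedily one-to-one, then aggregate into an F-score."""
    if not hyp_frames or not ref_frames:
        return 0.0
    remaining = list(ref_frames)
    matched = 0.0
    for hyp in hyp_frames:
        if not remaining:
            break
        best = max(remaining, key=lambda ref: frame_match(hyp, ref))
        score = frame_match(hyp, best)
        if score > 0:
            matched += score
            remaining.remove(best)
    precision = matched / len(hyp_frames)
    recall = matched / len(ref_frames)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Usage: frames that would come from running SRL on each sentence.
reference = [Frame("sold", {"who": "Mary", "what": "the book", "whom": "John"})]
hypothesis = [Frame("sold", {"who": "Mary", "what": "a book", "whom": "John"})]
print(srl_score(hypothesis, reference))
```

A hypothesis whose frames are identical to the reference scores 1.0, and one with no shared predicate scores 0.0; partial matches fall in between.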

Here is a demo of this concept. To see this demo, you will need the same username and password provided above.

Machine Translation using TRAAM

We experiment with a neural network variant of ITG (Inversion Transduction Grammar), which we call TRAAM (Transduction Recursive Auto-Associative Memory). It aims to solve the label inference and token hallucination problems in ITG. ITG has the intrinsic advantage of being a tree model, so various constraints from traditional statistical NLP can be applied, such as a dependency tree constraint or an SRL constraint.

My research is still in progress. More information can be found in the pioneering work by a senior member of my group (who has since graduated): Transduction Recursive Auto-Associative Memory: Learning Bilingual Compositional Distributed Vector Representations of Inversion Transduction Grammars

This same model can be used to build a rap battle bot. You can input a hip hop line to the bot, and it will reply according to the TRAAM model, which is pretty fun. Link here (The microphone input cannot be used because our site's HTTPS certificate has expired, which causes the browser to block all microphone access. Sorry.)

Main Duties

Develop the WebAPI of the HLTC group. WebAPI is a project that enables our binaries to be invoked remotely via HTTP calls.

Develop the HLTC group's demos on the web.

Develop the basic neural network library used by the HLTC group, written in C++. Repository here. (Since this library is used internally only, no README is provided. Sorry.)

Serve as teaching assistant for the following courses in spring 2019:

Introduction to Natural Language Processing
Introduction to Artificial Intelligence