Ron contact:
roncwu@gmail.com
~ Projects ~ Notes

Finance

1. Algorithmic Trading

The following is a backtest of my mean-reversion algorithm from Quantopian. It rebalances weekly, going long the previous week's worst-performing liquid stocks (bottom 10%) and shorting the best-performing ones (top 10%). During the backtest period, 2015-03 to 2016-04, it consistently outperformed the benchmark with low beta and low volatility.

#liquid securities 
low_returns = recent_returns.percentile_between(0,10,mask=high_dollar_volume)
high_returns = recent_returns.percentile_between(90,100,mask=high_dollar_volume)

#For each security in the universe, order long or short positions with equal weights
    if data.can_trade(stock):
        if stock in context.long_secs.index:
            order_target_percent(stock, context.long_weight)
        elif stock in context.short_secs.index:
            order_target_percent(stock, context.short_weight)
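
The selection and weighting logic above can be sketched outside Quantopian in plain Python. This is only an illustration under stated assumptions: `select_deciles`, `equal_weights`, the toy tickers, and the 50% long / 50% short split are hypothetical and not taken from the original algorithm.

```python
def select_deciles(recent_returns, liquid):
    """Return (longs, shorts): bottom and top 10% of liquid stocks by recent return."""
    ranked = sorted((s for s in recent_returns if s in liquid),
                    key=lambda s: recent_returns[s])
    k = max(1, len(ranked) // 10)  # decile size, at least one name
    return ranked[:k], ranked[-k:]


def equal_weights(longs, shorts):
    """Assumed split: +50% of capital across longs, -50% across shorts."""
    weights = {s: 0.5 / len(longs) for s in longs}
    weights.update({s: -0.5 / len(shorts) for s in shorts})
    return weights


# Toy universe: 21 tickers with weekly returns from -10% to +10%.
recent_returns = {f"S{i:02d}": (i - 10) / 100 for i in range(21)}
# S00 has the worst return but is illiquid, so the mask excludes it.
liquid = set(recent_returns) - {"S00"}

longs, shorts = select_deciles(recent_returns, liquid)
weights = equal_weights(longs, shorts)
print(longs, shorts, weights)
```

The liquidity mask is applied before ranking, mirroring the `mask=high_dollar_volume` argument in the snippet above, so an illiquid loser never enters the long book.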

Many advanced techniques can supplement the simple mean-reversion and momentum strategies (check them out in my other project categories):

► Machine Learning: PsychSignal's StockTwits trader mood, Twitter feeds run through the Alchemy API (NLP), chart/data pattern recognition and categorization (supervised/unsupervised learning).

► Numerical Methods: modern portfolio theory, dynamic mode decomposition (PCA), random matrix theory.

► Web/App design: using Python, C++, C#, R, SQL, and JavaScript, I built high-quality financial apps that help portfolio managers and traders automate their work.

2. Paper-Money Option Trading

While learning portfolio theory, I created trading algorithms; while learning derivatives, I traded paperMoney options from TD Ameritrade.

As of 5/25/16, after a week of steadily breaking out of a descending triangle, AAPL is on the rise. The at-the-money monthly $100 call trades at $1.76 and expires in 3 weeks. One can also sell an out-of-the-money call to form a bull spread, taking advantage of the low-VIX environment.

As of 06/07/16, AAPL has not risen to the $100-$103 range that I anticipated over the last 2 weeks. The day after I bought the $100 call and sold the $105 call, AAPL gained 1%, which could have brought in a 31% profit in a single day. However, things went south after Memorial Day weekend, compounded by negative theta as expiry approached. I closed the $100 call on 6/1, after Memorial Day, and ended up losing (1.1 - 1.7) / 1.7 = -35% on the $100 calls. Looking forward, I can probably let the $105 calls expire and reduce the loss to (1.1 + .34 - 1.7) / 1.7 = -15%.
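
The two loss figures above can be reproduced with a short calculation. The helper name `return_on_premium` is hypothetical. Following the text, the return is measured against the $1.70 paid for the long call rather than against the net debit of the spread.

```python
def return_on_premium(entry_price, exit_price, credit_kept=0.0):
    """Profit or loss as a fraction of the premium paid for the long call."""
    return (exit_price + credit_kept - entry_price) / entry_price


# Long $100 call: bought at 1.7, closed at 1.1.
long_call_only = return_on_premium(1.7, 1.1)

# Same leg, plus the 0.34 premium kept if the short $105 call expires worthless.
with_short_call = return_on_premium(1.7, 1.1, credit_kept=0.34)

print(f"{long_call_only:.0%} on the long call, {with_short_call:.0%} including the short call")
```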

A few lessons I learned from this case: cash in profits quickly; pay more attention to calendar dynamics, since time decay in the last 2 weeks before expiry is too heavy a burden to bear; never trade naked options; document what led to each trading decision and reflect on what worked and what did not.

3. Computational Finance

With a deep understanding of numerical analysis, PDEs, machine learning, statistics and probability theory, I wrote computational algorithms that meet both accuracy and speed requirements.

I am familiar with many popular computational toolboxes and libraries. As a risk analyst, I compute metrics and present them in graphs. In the example below, I used Plotly to make an interactive graph.

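library(plotly)
# plotly 4.x expects data columns as formulas (~column); the economics dataset comes from ggplot2
p <- plot_ly(economics, x = ~date, y = ~100 * unemploy / pop,
             name = "unemployment rate", type = "scatter", mode = "lines") %>%
      # LOESS fit of the same series, drawn as a smooth trend line
      add_trace(y = ~fitted(loess(100 * unemploy / pop ~ as.numeric(date))),
                x = ~date, name = "trend") %>%
      layout(title = "Unemployment")
# hide the mode bar above the chart
config(p, displayModeBar = FALSE)

4. My Notes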

Statistics, Time Series, Computational Finance, Derivative Pricing, Algorithmic Trading

ADT, OOP & Algorithm, Design Patterns, Software Testing, Compiler Language, Operating System

Machine Learning

1. HEP meets ML

The Large Hadron Collider (LHC) in Geneva, Switzerland is the single largest machine in the world. Its main mission is to detect new particles and search for new physics beyond the Standard Model. Not only does the machine produce unmanageable quantities of data, but the measurements themselves, such as impact parameter and momentum, are non-deterministic. In the language of hidden Markov models (HMMs), the latent states (unobservable, short-lived intermediate quantum processes) are not the same as the observables. Figuring out what is noise and what is not becomes the essential task, so it is no wonder that statistics and machine learning are the central tools for analysis.

To learn more about how ML meets HEP, watch this talk by Kyle Cranmer (NYU), who was an invited speaker at the 2016 NIPS conference. While studying under Prof. Cranmer, I wrote a report describing jet algorithms and simulation algorithms. It can be found here.

2. My Notes

► [Theory] Machine Learning, Neural Network, Deep Learning, Reinforcement Learning, Artificial Intelligence

► [Practice] Convolutional Neural Networks for Image Processing, Recurrent Neural Networks for Natural Language Processing with TensorFlow

► [Practice] Deep Reinforcement Learning with TensorFlow [in progress]

Numerical Methods

1. Fusion Energy

Nuclear fusion plants are being built experimentally around the world. Fusion may become one of the most promising energy resources: it could generate a virtually unlimited amount of sustainable energy while producing 100 times less radioactive waste than existing nuclear power plants, all of which are powered by fission. In contrast to fission's deadly fuel, uranium, fusion uses stable and safe hydrogen.

To tackle some of the technical challenges surrounding fusion devices, my advisor and I proposed a new algorithm for tracing 3D magnetic field lines inside reactors and for determining whether the magnetic configurations are controllable. In this research, we developed an alternative to Runge-Kutta integration by modifying the Lax-Friedrichs sweeping method, which linked Courant stability in numerical analysis to the physical stability of the magnetic configurations inside the reactors.
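
For context, here is a minimal sketch of the conventional baseline mentioned above: tracing a field line with classical fourth-order Runge-Kutta. It is not the modified Lax-Friedrichs sweeping method from the research. The field `toy_field` and all function names are hypothetical, chosen so that the exact field lines are circles and the result is easy to check.

```python
from math import sqrt


def toy_field(p):
    """Hypothetical field B = (-y, x, 0); its field lines are circles around the z-axis."""
    x, y, _ = p
    return (-y, x, 0.0)


def direction(p, field):
    """Unit tangent of the field line through point p."""
    b = field(p)
    norm = sqrt(sum(c * c for c in b))
    return tuple(c / norm for c in b)


def shift(p, d, h):
    """Move point p a distance h along direction d."""
    return tuple(pi + h * di for pi, di in zip(p, d))


def rk4_step(p, h, field):
    """One classical RK4 step of arc length h along the field line."""
    k1 = direction(p, field)
    k2 = direction(shift(p, k1, h / 2), field)
    k3 = direction(shift(p, k2, h / 2), field)
    k4 = direction(shift(p, k3, h), field)
    mean = tuple((a + 2 * b + 2 * c + d) / 6 for a, b, c, d in zip(k1, k2, k3, k4))
    return shift(p, mean, h)


def trace(start, h, steps, field=toy_field):
    """Return the list of points visited along the field line from start."""
    points = [start]
    for _ in range(steps):
        points.append(rk4_step(points[-1], h, field))
    return points


# About one turn around the unit circle (arc length 6.28).
line = trace((1.0, 0.0, 0.0), h=0.01, steps=628)
max_radius_error = max(abs(sqrt(x * x + y * y) - 1.0) for x, y, _ in line)
print(max_radius_error)
```

Because the exact field line through (1, 0, 0) is the unit circle, the drift in radius is a direct measure of the integration error.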

This project was a perfect fit for me, since it demanded a combined grasp of advanced physics, rigorous mathematics and computational efficiency. The slides are provided here and the paper can be read here.

2. My Notes

► [Theory] Numerical Analysis, High Performance Computing, Big Data, Quantum Computing

► [Practice] Big Data Machine Learning, Distributed Machine Learning, Parallel Machine Learning with Scala, Spark

► [Practice] Quantum Computing on the Cloud, Quantum Machine Learning, Quantum Neural Network (Tensor Networks), Quantum Information Science and AdS/CFT Correspondence of Quantum Gravity [in progress]

3. Quantum MeetUp Notes

I am a regular member of the NY Quantum Computing and NY Quantum Theory Meetup groups. The groups consist of computer scientists, mathematicians and physicists. We discuss the latest developments in these fields and how they can be applied to improve our businesses.

Quantum Computing: Intro, logistics (3.2.17)
Quantum Calculations: Matrix Product States (3.11.17)

Competition

1. Hackerrank Competitive Programming

As an O(1) programmer on Hackerrank, as of March 2017, I rank #773 in the world, #111 in the United States and #2 among Columbia students.

2. Kaggle Machine Learning Contest

I finished the Kaggle Two Sigma global contest in 29th place (silver medal) against 2000+ data scientists.

I created this Shiny app (written in R) for exploratory data analysis. The app can be opened by clicking the pic below.

project image

To run it locally, use the source code provided here and download the dataset from Kaggle. I have since added more functionality, such as corr(), lagging(), unit root, spectrogram, Kalman, rolling mean, exponential smoothing, etc.

Hackathon

1. Quantopian Alpha Hackathon 2016

According to Quantpedia, "Reversal During Earnings Announcements" and "Reversal in Post-Earnings Announcement Drift" are among the few strategies with yearly returns over 40%. This year's Quantopian Alpha Hackathon centered on earnings. I built a trading algorithm derived from Mr. Seong Lee's algo. Our algo returned 560% over the 8-year period 2007-2014, compared with the S&P 500, which grew only 70% during the same period.
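
To put the two cumulative figures on the same yearly scale as the 40% quoted above, they can be annualized. This sketch assumes that "560%" and "70%" are total returns over the 8 years (final values of 6.6x and 1.7x the starting capital). The helper name `annualized_return` is hypothetical.

```python
def annualized_return(total_return, years):
    """Compound annual growth rate implied by a cumulative return."""
    return (1.0 + total_return) ** (1.0 / years) - 1.0


algo = annualized_return(5.60, 8)   # +560% over 2007-2014
sp500 = annualized_return(0.70, 8)  # +70% over the same period

print(f"algo: {algo:.1%} per year, S&P 500: {sp500:.1%} per year")
```

Under that assumption, the backtest compounds at roughly 27% per year against roughly 7% for the index.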

The central idea of our strategy was PEAD (post-earnings announcement drift) plus the Buffett doctrine ("Diversification is protection against ignorance"). Our portfolio held 50% in cash (and kept leverage under 1) most of the time, so that when an opportunity came we could go all-in on the stocks that passed our very stringent tests and quickly cash in the profits. In the end, our portfolio had very low correlation to the market, strong protection against crises, and relatively acceptable volatility. A full backtest result can be found here.

2. Columbia DataScience Hackathon 2016

It is well known that tasks that are difficult for humans are often easy for computers, and tasks that are difficult for computers are often easy for humans. For example, graph partitioning is NP-hard, yet it is often easy for humans because we can simply see the partition.

Participating in the Data Hackathon, I used graph theory to build networks of asset classes (stocks, index funds, commodities, economic indicators, ...) and depict the relationships among them. The graph helps narrow down the factors that are most related to each asset.

How do we define relationships? The easiest measure was Pearson correlation, which is symmetric and so gave us an undirected (bidirectional) graph. After that, we borrowed the idea of autoregression, but used the historical data of other stocks to predict each stock. We took the coefficient of determination (the R-squared of the linear regression) as the new edge weight and called it the predicting power, which gave us a directed graph. Many other machine learning algorithms can be used in the same way: the accuracy score of each algorithm becomes a new edge weight, or several weights are combined into one edge weight, as in boosting.
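
The directed "predicting power" graph can be sketched in a few lines. This is an illustrative reconstruction, not the hackathon source code. It assumes a one-variable linear regression of one series on the previous value of another, where R-squared equals the squared Pearson correlation. The function names and toy series are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def predicting_power(source, target, lag=1):
    """R-squared of regressing target[t] on source[t - lag] (simple linear regression)."""
    r = pearson(source[:-lag], target[lag:])
    return r * r


def build_graph(returns, lag=1):
    """Directed edges (a, b) weighted by how well a's past predicts b."""
    return {(a, b): predicting_power(returns[a], returns[b], lag)
            for a, b in permutations(returns, 2)}


# Toy data: B follows A with a one-step delay, so A -> B should score near 1.
a = [0.01, -0.02, 0.03, 0.00, -0.01, 0.02]
b = [0.0] + [2 * x + 0.001 for x in a[:-1]]
graph = build_graph({"A": a, "B": b})
print(graph)
```

Unlike the correlation graph, the weights are asymmetric: the edge from A to B is strong while the edge from B to A is weak, which is what makes the graph directed.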

The project is just a beginning. We can go beyond one-step prediction to multi-step prediction, i.e. deep learning and the search for hidden layers. In doing so, better indicators may be formed or better financial instruments may be structured. The source code can be found here, and a NetJSON visualization can be opened by clicking the image below.

project image

3. Columbia DevFest Hackathon 2016

With hacking incidents on the rise across the nation and secure information sharing more at risk than ever, I participated in the App Hackathon, where we set out to build the safest social network site.

The idea was that a safer way to share messages is to use no encryption and attach no personal information to them. The app lets a user go to our website and claim an alias, post a message there, and give the alias to the people he wants to share the message with. His friends then go to our site and read the message without a password. If the original creator of the alias does not log in for 2 days, the alias expires and becomes available for other users to claim.
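
The alias lifecycle described above can be sketched as a small in-memory model. This is not the site's actual code: the class `AliasBoard`, its method names, and passing the current time explicitly are all assumptions made for illustration, and the step that lets the site recognize the creator at login is left out.

```python
from datetime import datetime, timedelta

EXPIRY = timedelta(days=2)  # alias lifetime without a creator login, per the text


class AliasBoard:
    def __init__(self):
        # alias -> {"last_login": datetime, "messages": list of str}
        self._aliases = {}

    def _expired(self, alias, now):
        entry = self._aliases.get(alias)
        return entry is None or now - entry["last_login"] >= EXPIRY

    def claim(self, alias, now):
        """Claim a free or expired alias; earlier messages are discarded."""
        if not self._expired(alias, now):
            return False
        self._aliases[alias] = {"last_login": now, "messages": []}
        return True

    def login(self, alias, now):
        """A creator login resets the 2-day expiry timer."""
        if self._expired(alias, now):
            return False
        self._aliases[alias]["last_login"] = now
        return True

    def post(self, alias, message, now):
        """Attach a message to a live alias."""
        if self._expired(alias, now):
            return False
        self._aliases[alias]["messages"].append(message)
        return True

    def read(self, alias, now):
        """Anyone who knows the alias can read; no password is involved."""
        if self._expired(alias, now):
            return None
        return list(self._aliases[alias]["messages"])


board = AliasBoard()
t0 = datetime(2016, 1, 1)
board.claim("owl", t0)
board.post("owl", "meet at noon", t0)
print(board.read("owl", t0 + timedelta(days=1)))
```

Passing `now` as a parameter, rather than reading the clock inside the class, keeps the expiry rule easy to exercise with fixed timestamps.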

The source code can be found here, and the site can be opened here.

Coding Exercise

Exercises from Gayle Laakmann McDowell's book Cracking the Coding Interview. The official Python solutions were missing a few chapters, so I completed them. Comments and corrections are highly valued; I can be reached at roncwu@gmail.com.

Data Structures
Chapter 1: Arrays & Strings
Chapter 2: Linked Lists
Chapter 3: Stacks and Queues
Chapter 4: Trees and Graphs

Algorithms
Chapter 5: Bit Manipulation
Chapter 6: Math and Logic Puzzles
Chapter 7: Object-Oriented Design
Chapter 8: Recursion and Dynamic Programming
Chapter 9: System Design and Scalability
Chapter 10: Sorting and Searching
Chapter 11: Testing

Language Specifics
Chapter 12: C++
Chapter 13: Java
Chapter 14: SQL
Chapter 15: Threads, Locks

YouTube Videos

How Things Are Made by Physics?

When I was a physics tutor at Columbia, I created a series of videos to reach out to students, promote physics and promote my tutoring business. Below is the first video of the series, made around Thanksgiving in 2014.