Tencent’s WeBank applying “federated learning” in A.I.

China’s first mobile bank, Tencent’s WeBank, is partnering with a Hong Kong startup to access decentralized sources of data.

Published

July 29, 2019

Larissa Ku

WeBank, the first digital bank established in China, is developing a new approach to artificial intelligence called federated learning as regulators tighten privacy and security rules.

The bank, founded in 2014 by Tencent as an all-digital financial institution, has piloted the technology in China with a national electronic invoice (fapiao) center and developed its first federated learning model, for credit rating, in April.

“The invoice information is secret,” said Chen Tianjian, deputy general manager at WeBank’s A.I. department. “Invoice centers were willing to work with WeBank because they would remain the only owner and controller of their data.”

The new model draws on WeBank’s own data as well as encrypted invoice data, which stays on the invoice center’s servers. The co-developed model is restricted to measuring the credit risk of small and micro-enterprises.

So far, Chen says, the model has halved the number of defaults among WeBank’s loans to these customers.

Decentralized data

Artificial intelligence relies on lots of data. Without enough data, a model cannot be trained effectively. And without that, you can’t glean insights or rely on the software to improve.

So technology companies or those using A.I., like banks, are hungry for data, particularly relevant data from targeted users (like customers).

But this demand is running smack into protective walls for privacy and security. Both regulators and tech companies are promoting measures to ensure individuals retain control over their personal data. This is leading to a series of measures that require consent or prohibit selling data or storing it outside certain countries’ borders.

The risk is that data becomes as siloed as today’s banking departments – and therefore it becomes very cumbersome and expensive to properly train a computer model.

Federated data

WeBank’s A.I. team is working with a Hong Kong-based startup called Clustar on “federated learning”. It is designed to integrate data scattered throughout different departments, companies, or jurisdictions. It takes data that is inert and makes it useful.

Federated learning requires companies and institutions to collaborate. But they can’t share or transfer data, which remains distributed and encrypted.

Take the example of WeBank’s pilot efforts to manage credit risk.

Traditional banks have little visibility over customers who are applying for loans from multiple institutions. Yet such information is vital to scoring a borrower’s risk.

Using federated learning, multiple banks jointly develop a model based on sub-models in each bank’s individual environment.

Let’s say a customer applies for a loan from Bank A. The bank’s credit officer has no way of knowing whether the applicant is already borrowing from other banks. But the co-developed algorithm could produce a general credit score suggesting this customer is risky. The algorithm never holds the customer’s full credit record: each bank owns part of the record and therefore answers only part of the question, and all related information is encrypted. It is then up to Bank A either to offer her a loan at a higher interest rate or to decline her altogether.
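
For readers who want to see the mechanics, the joint training described above can be sketched in a few lines of Python. This is a minimal illustration of federated averaging, a common federated learning recipe, and not WeBank’s actual system: the banks, their records and the function names are hypothetical, and the encryption of shared values that the article describes is left out for brevity. Each bank fits a sub-model on data that stays local, and only the model parameters are pooled.

```python
# Illustrative federated averaging; all names and data are hypothetical.

def local_update(weights, data, lr=0.1, epochs=20):
    """Run gradient descent for y ~ w*x + b on one bank's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Combine the sub-models, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)

def train(banks, rounds=50):
    """Alternate local training and averaging for a fixed number of rounds."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in banks]
        global_weights = federated_average(updates, [len(d) for d in banks])
    return global_weights

# Each bank's records stay in its own list; only (w, b) pairs are exchanged.
bank_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
bank_b = [(3.0, 7.0), (4.0, 9.0)]
model = train([bank_a, bank_b])
```

In this toy data every record lies on the line y = 2x + 1, so the shared model should settle close to w = 2 and b = 1 even though neither bank ever sees the other’s records.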

“Federated learning is a distributed and encrypted system, which naturally protects the customer’s privacy,” said WeBank’s Chen.

So one use case for federated learning is to link information to a user identity from among the universe of similar institutions – in this case, banks. Every time a given identity (a person) interacts with the banks, information is added to that identity. Other use cases can help resolve differences in data for the same group of customers: for example, a bank and an e-commerce company may have the same customer base but a different set of data.

Computing challenges

Some of the concepts behind federated learning are similar to distributed ledger technology, a.k.a. blockchain. Both technologies seek solutions at the level of the marketplace, rather than within single organizations. Both seek to enable transactions while preserving encryption, privacy, and security.

The challenge for federated learning is that making calculations using data that remains encrypted is incredibly difficult. Blockchain doesn’t require such calculation; it just creates and validates information on a shared ledger. It is infrastructure. Federated learning is the prospect of adding brains to otherwise unconnected points of data, while leaving security intact.

Chen says encryption increases calculation volumes by hundreds of times. If A.I. training takes 10 hours in an unencrypted model, then an encrypted training session would require at least 100 hours, and maybe 1,000.
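
To give a flavor of what calculating on encrypted data means, here is a toy Python sketch of the Paillier cryptosystem, a well-known scheme that lets two encrypted numbers be added without decrypting either. The article does not say which scheme WeBank uses, so Paillier is an assumption chosen purely for illustration; the tiny primes, the loan figures and the function names are hypothetical, and real deployments use far larger keys.

```python
import math
import random

def keygen(p, q):
    """Build a toy Paillier key pair from two primes (generator fixed to n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8 or later
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """Recover the plaintext using the private values (lam, mu)."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the hidden plaintexts (modulo n)."""
    return (c1 * c2) % (n * n)

n, private_key = keygen(61, 53)  # toy primes, far too small for real use
loan_at_bank_a = encrypt(n, 1200)
loan_at_bank_b = encrypt(n, 800)
encrypted_total = add_encrypted(n, loan_at_bank_a, loan_at_bank_b)
total = decrypt(n, private_key, encrypted_total)  # 2000
```

Note that the single addition of two loan amounts has turned into modular exponentiations and a multiplication of much larger numbers, which is the kind of overhead Chen describes.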

For WeBank’s collaboration with electronic invoice data, the model for credit risk took up to four months to become useful. But invoices involve small data sets, and a risk control model does not need to be updated quickly. It normally updates every six months, according to Chen.

“There is no problem to spend a hundred hours modeling in this case,” Chen said. “But the current computing power will limit the implementation in many other areas that iterate quickly or that are data-intensive.”

However, WeBank faces technical challenges in extending federated learning models to more complicated use cases. That’s where Clustar comes in.

Clustar’s role

Clustar is an A.I. infrastructure startup launched by Chen Kai, a professor at Hong Kong University of Science and Technology (HKUST), in 2018. It has raised $10 million from investors such as Sequoia Capital and Stone VC, and reached a valuation of $80 million last month.

(Universities in Hong Kong have become an important force in commercializing technology in recent years. SenseTime was founded in 2014 by professors at the Chinese University of Hong Kong and has since become a unicorn in the A.I. space. Da Jiang, or DJI, the world leader in camera drones, was founded in an HKUST dormitory by Frank Wang, then a student, with backing from his professor Li Zexiang.)

Earlier this year, Clustar did a proof of concept with WeBank to speed up the computing process for federated learning.

“We’ve chosen Clustar because there is very few companies in the market that can at the same time provide solutions for both network transmission optimization and computational optimization,” said Chen Tianjian.

He explained the two solutions that Clustar brought.

FPGA and RDMA

First, a typical computer processor works on 64 bits at a time, but the encryption creates very large numbers of, for example, 1,024 bits. A computer therefore has to break such calculations into bite-size pieces, that is, chunks of 64 bits. “It will become super lengthy,” said Zhang Junxue, executive vice president at Clustar.
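
The chunking Zhang describes can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: Python already handles arbitrarily large integers on its own, the function names are hypothetical, and the grade-school multiplication method shown here is chosen for clarity rather than taken from Clustar or WeBank. It splits two 1,024-bit numbers into 64-bit chunks and counts how many 64-bit multiplications one big multiplication needs.

```python
LIMB_BITS = 64
MASK = (1 << LIMB_BITS) - 1

def to_limbs(value, total_bits=1024):
    """Split a non-negative integer into 64-bit chunks, least significant first."""
    return [(value >> shift) & MASK for shift in range(0, total_bits, LIMB_BITS)]

def from_limbs(limbs):
    """Reassemble an integer from its 64-bit chunks."""
    return sum(limb << (i * LIMB_BITS) for i, limb in enumerate(limbs))

def multiply(a_limbs, b_limbs):
    """Grade-school multiplication; returns (product chunks, 64-bit multiply count)."""
    result = [0] * (len(a_limbs) + len(b_limbs))
    steps = 0
    for i, a in enumerate(a_limbs):
        carry = 0
        for j, b in enumerate(b_limbs):
            total = result[i + j] + a * b + carry
            result[i + j] = total & MASK      # keep the low 64 bits in place
            carry = total >> LIMB_BITS        # pass the rest to the next chunk
            steps += 1
        result[i + len(b_limbs)] = carry
    return result, steps

x = (1 << 1023) | 12345   # hypothetical 1,024-bit operands
y = (1 << 1000) | 67890
product_limbs, steps = multiply(to_limbs(x), to_limbs(y))
# steps is 256: each of the 16 chunks of x meets each of the 16 chunks of y
```

One multiplication of 1,024-bit numbers thus becomes 256 small multiplications plus carry handling, and encrypted training repeats that a very large number of times.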

The startup has brought in field-programmable gate arrays, or FPGAs, to solve this. An FPGA is a chip whose circuitry can be customized, in this case to widen the numbers it handles in one operation, to 1,024 bits for example. It lets computers handle large calculations speedily.

In the proof of concept with WeBank, the FPGA approach sped up the process by four to five times, according to WeBank’s Chen.

The second solution is on the transmission side. Chen Kai, founder of Clustar, told DigFin that because of delay in data transmission, even if there are 10 computers calculating together, the actual computing power might only equate to two computers.
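
Chen Kai’s figure can be reproduced with a very simple model, sketched here in Python. The model is an assumption made for illustration, not Clustar’s own calculation: it supposes each machine alternates between computing and waiting for data, and the timings and the function name are hypothetical.

```python
def effective_machines(n_machines, compute_time, transfer_time):
    """Machines' worth of useful work when each one idles during transfers."""
    return n_machines * compute_time / (compute_time + transfer_time)

# Hypothetical timings: 1 unit of computing for every 4 units spent waiting.
equivalent = effective_machines(10, compute_time=1.0, transfer_time=4.0)  # 2.0
```

Under this model, 10 machines do the work of two when each spends four-fifths of its time waiting on the network.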

The solution is called RDMA (remote direct memory access), which removes the intermediate steps (copying the result from the computing card to the network card, copying it again to another computer’s network card, and so on) and writes the result directly to another computer’s memory.

“It may be copied five to six times,” WeBank’s Chen said. “Because the transmission is very slow, most of the time, we might be waiting for the transfer of the result, instead of modeling.”

He said that Clustar’s RDMA solution can also speed up the process by four to five times – raising the prospect that federated A.I. can be applied to far more complex products and markets.
