Build a Personalized Recommendation System on Alibaba Cloud in Three Steps

Background

Overview

Figure 1. The overall framework of recommendation algorithms

Schema Design

Figure 2. Schema of tables in AnalyticDB for PostgreSQL in a personalized recommendation system
CREATE TABLE news (
news_id bigint,
create_time timestamp,
title varchar(100),
content varchar(200),
keywords varchar(50),
click_times bigint,
two_hour_click_times bigint,
news_vector real[],
primary key (news_id)
) distributed by (news_id);
CREATE TABLE browses_history (
browse_id bigint,
news_id bigint,
person_id bigint,
browse_time timestamp,
primary key (browse_id)
) distributed by (browse_id);
CREATE TABLE person(
person_id bigint,
age bigint,
star float,
primary key (person_id)
) distributed by (person_id);

Implementing a Personalized Recommendation System

1) Extract News Feature Vectors

select feature_extractor('text', 'ADB For PG is very good!') ;
insert into news(news_id, create_time, title, content, 
keywords, click_times,two_hour_click_times)
values(1, now(),'South Korea military: North Korea launched a UFO eastward in the north Pyeongan Province','According to South Korea's Joint Staff Headquarters, at about 16:30 p.m local time today, North Korea launched a UFO eastward in the north Pyeongan Province.', 'South Korea, North Korea, UFO', 123, 3);

2) Extract User Feature Vectors

select keywords    
from Person p, Browses_History bh, News n
where p.person_id = bh.person_id and bh.news_id = n.news_id and p.person_id = 9527;
select feature_extractor('text', 'NBA sports,finals,Miami Heat,Houston Rockets'));

3) Provide News Recommendations Based on User Feature Vectors

select news_id, title, content, (extract(epoch from (now()-create_time)) * w1 + click_times/extract(epoch from (now()-create_time)) * w2 + two_hour_click_times/extract(epoch from (now()-create_time)) * w3 + ann_distance * w4) as rank_score
from (select *, l2_distance(news_vector, feature_extractor('textf', 'NBA sports,finals,Miami Heat,Houston Rockets')) as ann_distance from news order by ann_distance desc limit 500) S
order by rank_score desc;
  • ann_distance: indicates the degree of correlation between the user and the news.
  • create_time: indicates the time when the news was published.
  • click_times/(now()-create_time): indicates the total CTR of the news.
  • two_hour_click_times/(now()-create_time): indicates the CTR of the news within the last two hours.
  • w1, w2, w3, and w4: indicate the weights of all attributes during the learning of the logistic regression model.

Original Source:

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Alibaba Cloud

Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com