The tagging system sits at the core of data analysis: business analysis, user profiling, recommendation strategies, and more all depend on it. An accurate, efficient labeling system supplies rich material for later analysis and lets experience accumulate. This article covers how to avoid common mistakes when building a labeling system and offers a systematic approach to building and optimizing one.
Of all the work in data analysis, the labeling system most deserves priority, because every other job depends on it: business analysis, ad placement analysis, user profiling, recommendation strategies, product operations, and so on. All of it is driven by labels.
If the labeling system is done well, follow-up analysis has enough material to work with and experience accumulates. If it is done poorly, the effort is wasted, and there is nothing to rely on when in-depth analysis is needed later.
So how should it be done? Here is a brief overview.
The most common mistake is treating labels as a basket that anything can be thrown into.
User tags such as "high value", "potential", and "likes XX" get attached casually. Near-duplicate tags with similar names, such as "high value" and "high quality", even exist side by side.
These bad habits show up everywhere in user profiling projects. People often boast to me: "Mr. Chen, we are doing so well, we have more than 3000 ...... user tags."
At that point, you only need to ask:
Of those 3000 tags, how many are actually used by the business?
Of those 3000 tags, how much value have they generated?
They deflate at once, mumble "We are still exploring how to apply them......" and slip away.
Why? Because these tags are just a pile of dimensions lying in the database. For the business to use them, you first have to ask: what does the business need, and why would it use labels at all?
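The two questions above can be turned into a simple audit. The following is a minimal sketch, assuming a usage log exists in which each entry is the name of a tag that a business team actually used; the function name `audit_tag_usage`, the log format, and the sample tags are all hypothetical and not taken from any real system.

```python
from collections import Counter


def audit_tag_usage(tag_names, usage_events):
    """Split tags into used and unused, based on a log of business usage.

    tag_names: names of all tags defined in the tag library.
    usage_events: one tag name per recorded business use (assumed log format).
    """
    tag_names = list(tag_names)
    counts = Counter(usage_events)
    used = sorted(t for t in tag_names if counts[t] > 0)
    unused = sorted(t for t in tag_names if counts[t] == 0)
    total = len(used) + len(unused)
    usage_rate = len(used) / total if total else 0.0
    return {"used": used, "unused": unused, "usage_rate": usage_rate}


# Hypothetical sample data for illustration only.
tags = ["high value", "high quality", "likes sneakers", "potential"]
events = ["high value", "high value", "likes sneakers"]
report = audit_tag_usage(tags, events)
print(report["unused"])
print(report["usage_rate"])
```

A low usage rate suggests the library was built from available dimensions rather than from business needs, which is the problem described above.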
When building labels, there are at least three completely different classes of requirements.
The need to identify value quickly. Management dreads hundred-page PPT reports most of all. Labels can distill what matters to the business and point to the most critical factors.
For example:
This way, when performance fluctuates, management can see at a glance that the problem lies in XX, which saves a lot of time.
The need for inspiration when planning. The questions operations departments ask most often are:
In the end, these questions revolve around the "5 elements of planning". Ideally, labels (rather than scattered data) give a clear answer to each business question (as shown in the figure below).
The need for clear answers to specific questions. Frontline staff understand customer behavior and needs far better than data analysts sitting thousands of miles away.
What frontline staff need is not for you to teach them how to do their job (you could not teach it anyway), but this:
For example:
When a customer asks about a product, I can quickly look up the information.
When there is an incentive policy, I can quickly check how much of the target I have reached.
When an event goes live, I can quickly find out which customers can take part.
Clear guidance like this is the best tool.
Once you understand the business needs carefully, you will find that large-scale labeling is unnecessary. The business does not need a huge spread of labels! The key to a successful project is to provide fewer but more precise labels, build the habit of using them, and grow a complete system gradually.
So, where to start?
Note that the three types of requirements above differ in how hard they are to implement.
The easiest to satisfy is the need of frontline staff. In theory, it is enough to label the activities, products, and articles that the front line queries often, in a standard format, and load them into a tag library (as shown in the figure below).
But this alone does not meet the front line's needs, because searching for information on the front line is inherently difficult. For example, 30 activities may launch in the same month, yet the front line may care about only the two or three most popular ones. Frontline staff and customers often give those two or three nicknames, which produces odd search keywords. If you simply open up the tag library for queries, both the usage rate and the search accuracy tend to be low.
Therefore, the tools provided to the front line can be further optimized:
This raises how often labels are used and creates a chance to improve frontline efficiency.
The second easiest to push forward is the label that identifies value.
First, value is relatively simple and clear to define, so it is easy to build.
Second, management looks at value labels often, which gives the project visibility.
Third, value labels can be used to diagnose everyday fluctuations in indicators, so they appear often.
Even if you build nothing else, build these value labels first.
Common ones include:
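One way to handle the nickname problem described above is to index activities by their nicknames as well as their official names, and to fall back to the most-queried activities when nothing matches. This is only an illustrative sketch of that idea, not the specific optimization the original article lists; the field names (`nicknames`, `query_count`) and the sample activities are assumptions.

```python
def build_alias_index(activities):
    """Map every lowercased official name and nickname to an activity id."""
    index = {}
    for act in activities:
        for name in [act["name"], *act["nicknames"]]:
            index[name.strip().lower()] = act["id"]
    return index


def search(query, index, activities, top_n=3):
    """Return matching activity ids, or the most-queried ones as a fallback."""
    q = query.strip().lower()
    if q in index:
        return [index[q]]
    # Partial match: the query appears inside a known name or nickname.
    hits = sorted({aid for key, aid in index.items() if q and q in key})
    if hits:
        return hits
    # No match at all: surface the activities the front line asks about most.
    ranked = sorted(activities, key=lambda a: a["query_count"], reverse=True)
    return [a["id"] for a in ranked[:top_n]]


# Hypothetical sample data for illustration only.
activities = [
    {"id": "A1", "name": "Spring Member Rebate",
     "nicknames": ["the rebate", "spring deal"], "query_count": 120},
    {"id": "A2", "name": "New Customer Gift Pack",
     "nicknames": ["welcome box"], "query_count": 45},
    {"id": "A3", "name": "Store Anniversary Sale",
     "nicknames": [], "query_count": 300},
]
index = build_alias_index(activities)
print(search("Welcome Box", index, activities))
print(search("rebate", index, activities))
print(search("xyz", index, activities, top_n=2))
```

The nickname list would have to be collected from the front line itself, which also serves as a way to learn how they actually talk about each activity.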
(as shown below)
(as shown below)
The only challenge here is getting management to adopt the concept. The company has very likely never built such labels, and management has no consensus on "what is a high-value user" or "what is a high-quality channel", so the first attempt may be difficult. However, unless management is so out of touch that it does not know what its own products, channels, and users look like, label adoption can be promoted step by step. After all, everyone wants less report reading and more focus on core issues.
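A value label becomes easier to agree on when its definition is written down as one metric plus explicit cut-offs. The sketch below assumes monthly spend as the metric and two thresholds that management would have to decide; the metric, the threshold values, and the label names are illustrative assumptions, not definitions from the article.

```python
from collections import Counter


def value_label(monthly_spend, high_threshold, low_threshold):
    """Return a value label from one agreed metric and two agreed cut-offs."""
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if monthly_spend >= high_threshold:
        return "high value"
    if monthly_spend >= low_threshold:
        return "medium value"
    return "low value"


def label_distribution(spend_by_user, high_threshold, low_threshold):
    """Count how many users fall under each value label."""
    return Counter(
        value_label(spend, high_threshold, low_threshold)
        for spend in spend_by_user.values()
    )


# Hypothetical sample data and thresholds for illustration only.
spend = {"u1": 1200.0, "u2": 300.0, "u3": 40.0, "u4": 1500.0}
dist = label_distribution(spend, high_threshold=1000.0, low_threshold=100.0)
print(dist["high value"], dist["medium value"], dist["low value"])
```

Keeping the thresholds as explicit parameters makes the point of disagreement visible: the discussion with management is about two numbers, not about the code.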
Of the three types of requirements, the operations department's is the hardest to meet. "Likes" and "preferences" tags are very difficult to build.
Not to mention that even when a campaign works, it is hard to say how much of the result is due to the ad copy, the promotional offer, or the user's own needs......
So getting this right takes many iterations.
The way to iterate is to move from users with a lot of data to users with little data. For example:
These extreme groups usually contribute heavily to performance and have plenty of data, so patterns are easy to summarize. And when high spenders stop spending or highly active users fail to convert, the business department will rush to find a fix, so the predictions can be verified further against those business actions.
As for users with little data, you can first probe them along a fixed recommendation route (as shown in the figure below), combined with business actions, to test their needs and gradually improve prediction accuracy.
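The two-track approach above (summarize patterns for data-rich users, walk data-poor users along a fixed route) can be sketched as follows. This is a simplified illustration under stated assumptions: a user's history is a list of category names, the cut-off `min_events` is arbitrary, and the route and its step names are made up for the example.

```python
from collections import Counter


def preferred_category(events, min_events=20):
    """Return the most frequent category for a data-rich user, else None."""
    if len(events) < min_events:
        return None
    return Counter(events).most_common(1)[0][0]


def next_recommendation(events, responses, route, min_events=20):
    """Pick what to recommend next for one user.

    events: category names from the user's history.
    responses: route step -> True/False for steps already tested.
    route: fixed, ordered list of steps to try on data-poor users.
    """
    pref = preferred_category(events, min_events)
    if pref is not None:
        return pref
    # A step the user already responded to counts as a tested need.
    for step in route:
        if responses.get(step) is True:
            return step
    # Otherwise continue with the first step not yet tested.
    for step in route:
        if step not in responses:
            return step
    return None  # route exhausted with no positive response


# Hypothetical sample data for illustration only.
route = ["starter kit", "best seller", "discount bundle"]
rich_user = ["shoes"] * 15 + ["bags"] * 10
sparse_user = ["bags"]
print(next_recommendation(rich_user, {}, route))
print(next_recommendation(sparse_user, {}, route))
print(next_recommendation(sparse_user, {"starter kit": False}, route))
```

Each recorded response adds data for that user, so over several iterations a data-poor user can move toward the data-rich track.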
Labeling is crucial: it is an important tool for quantifying qualitative factors and supporting value judgments, and it is foundational work. However, a label project must be tied to business analysis (for management), activity support (for operations), and system tools (for the front line); you cannot toil away in obscurity. Otherwise everyone assumes you are quietly brewing a furnace of elixir behind closed doors, and since they neither took part in the process nor use the result, they are bound to be disappointed in the end.
This article was published on Operation School by the author [Down-to-earth Teacher Chen] (WeChat public account: [Down-to-earth Teacher Chen]). It is original/authorized content on Operation School; reprinting without permission is prohibited.
The title image is from Unsplash and is licensed under CC0.