

Twitter patents and other publications reveal likely aspects of how tweets become promoted in the timeline feeds of users.
Some of Twitter’s timeline ranking factors are very surprising, and adjusting your approach to Tweeting may help you to gain greater visibility of your Tweets.
Drawing on several key patents and other sources, I have outlined the probable ranking factors for Twitter’s algorithm below.
Twitter first began using an algorithm-based timeline back in 2016 when it switched from what was purely a chronological feed of Tweets from all the accounts one followed. The change ranked users’ timelines to allow them to see “the best Tweets first.” Twitter has since experimented with variations of this up to the present.
A feed-based algorithm for social media is not unusual. Facebook and other social media platforms have done the same.
The reasons for this change to an algorithmic mix of timeline Tweets are pretty clear. A purely personal, chronological timeline composed of only the accounts one has followed is very siloed and therefore limited – while introducing posts from accounts beyond one’s direct connections has the potential to increase the time one spends on the platform, which in turn increases overall stickiness, which in turn increases the worth of the service to advertisers and data partners.
The various interest classifications of users, and the interest topics associated with their accounts and Tweets, further enable advertisement targeting based upon user demographics and content topics.
Twitter power users may have developed some intuitions about various Tweet factors that can result in greater visibility within the algorithm.
Corporations register patents all the time for inventions that they do not actually use in live service. When I worked at Verizon, I personally wrote a number of patent drafts for various inventions that my colleagues and I developed in the course of our work – including things that we did not end up using in production.
So, the fact that Twitter has patents that mention ideas for how things could work does not at all guarantee that that is how things do work.
Also, patents typically contain multiple embodiments, which are essentially various ways in which an invention could be implemented – patents attempt to describe the key elements of an invention as broadly as possible in order to claim any possible use that could be attributed to it.
Finally, just as with the famous PageRank algorithm patent that was the foundation of Google’s search engine, in instances where Twitter has used an embodiment from one of their patents, it is highly likely that they have changed and refined the simple, broad inventions described, and will continue to do so.
Even despite all this typical vagueness and uncertainty, I found a number of very interesting concepts in the Twitter patent descriptions, many of which are highly likely to be incorporated within their system.
One additional caveat before I proceed involves how Twitter’s timeline algorithm has incorporated Deep Learning into its DNA, coupled with various levels of human supervision, making it a frequently, if not constantly, self-evolving beast.
This means that both large changes and small, incremental changes can and will occur in how it performs content ranking. Further, this machine learning approach can lead to conditions where Twitter’s own engineers may not know precisely why some content is displayed or outranks other content, because the ranking models produced are abstracted, similar to what I described when writing about models produced by Google’s quality ranking through machine learning.
Despite the complexity and sophistication of how Twitter’s algorithm is functioning, understanding the factors that likely go into the black box can still reveal what influences rankings.
Twitter’s original timeline was simply composed of all the Tweets from the accounts one has followed since one’s last visit, which were collected and displayed in reverse-chronological order with the most recent Tweets shown first, and each earlier Tweet shown one after another as one scrolled downward.
The current algorithm is still largely composed of that same reverse-chronological listing of Tweets, but Twitter performs a re-ranking to try to display the most-interesting Tweets first and foremost out of recent Tweets.
In the background, the Tweets have been assigned a ranking score by a relevance model that predicts how interesting each Tweet is likely to be to you, and this score value dictates the ranking order.
The Tweets with highest scores are shown first in your timeline list, with the remainder of most-recent Tweets shown further down. It is notable that interspersed in your timeline are now also Tweets from accounts you are not following, as well as a few advertisement Tweets.
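To make the re-ranking idea concrete, here is a minimal sketch. It assumes each recent Tweet already carries a relevance score from some model; the `Tweet` fields, the `build_timeline` function, and the `top_n` cutoff are all hypothetical names for illustration, not anything Twitter has published.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: str
    age_hours: float  # hours since the Tweet was posted
    score: float      # hypothetical relevance-model output between 0 and 1


def build_timeline(tweets, top_n=2):
    """Put the top_n highest-scoring Tweets first, then the rest newest-first."""
    by_score = sorted(tweets, key=lambda t: t.score, reverse=True)
    top = by_score[:top_n]
    top_ids = {t.tweet_id for t in top}
    # Everything else stays in reverse-chronological order (smallest age first).
    rest = sorted(
        (t for t in tweets if t.tweet_id not in top_ids),
        key=lambda t: t.age_hours,
    )
    return top + rest


tweets = [
    Tweet("a", 0.5, 0.20),
    Tweet("b", 3.0, 0.90),
    Tweet("c", 1.0, 0.40),
    Tweet("d", 6.0, 0.75),
]
timeline_ids = [t.tweet_id for t in build_timeline(tweets)]
```

In this toy example the two highest-scoring Tweets jump to the top even though they are the oldest, and the remaining Tweets keep their reverse-chronological order.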
First of all, one of the most influential aspects of the Twitter timeline is that Twitter now displays Tweets based not only upon your direct connections, but upon your unique social graph, which Twitter refers to in patents as a “connection graph”.
The connection graph represents accounts as nodes and relationships as lines (“edges”) connecting one or more nodes. A relationship may refer to associations between Twitter accounts.
For example, following, subscribing (such as via Twitter’s Super Follows program or, potentially, for Twitter’s announced subscription feature for keyword queries), liking, tagging, etc. – all of these create relationships.
Relationships in one’s connection graph may be unidirectional (e.g., I follow you) or bidirectional (e.g., we both follow each other). If I follow you, but you do not follow me, I would have a greater expectation of seeing your Tweets and Retweets appearing in my timeline, but you would not necessarily expect to see mine.
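A connection graph of this kind can be sketched as a directed graph whose edges carry a relationship type. The class and method names here are my own illustrative choices; the patents describe the concept, not this code.

```python
from collections import defaultdict


class ConnectionGraph:
    """Accounts are nodes; each directed edge stores its relationship types."""

    def __init__(self):
        # edges[source][target] -> set of relationship kinds ("follow", "like", ...)
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_relationship(self, source, target, kind):
        self.edges[source][target].add(kind)

    def has(self, source, target, kind):
        # .get avoids creating empty entries when we only want to look something up
        return kind in self.edges.get(source, {}).get(target, set())

    def is_bidirectional(self, a, b, kind="follow"):
        return self.has(a, b, kind) and self.has(b, a, kind)


graph = ConnectionGraph()
graph.add_relationship("me", "you", "follow")       # unidirectional: I follow you
graph.add_relationship("me", "friend", "follow")
graph.add_relationship("friend", "me", "follow")    # bidirectional: we follow each other
graph.add_relationship("me", "stranger", "like")    # a like also creates a relationship
```

Here `graph.is_bidirectional("me", "friend")` is true while `graph.is_bidirectional("me", "you")` is not, which mirrors the follow example above.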
Simply based on the connection graph, you are likely to see Tweets and Retweets from those you have followed, as well as Tweets your connections have Liked or Replied to.
The Twitter algorithm has expanded the Tweets you may see beyond those from accounts that you have directly interacted with. The Tweets you may see in your timeline now also include Tweets from others who are posting about topics you have followed, Tweets similar in some ways to Tweets you have previously Liked, and Tweets based on topics that the algorithm predicts you might like.
Even among these expanded types of Tweets you may get, the algorithm’s ranking system applies – you are not receiving all Tweets matching your topics, likes, and predicted interests – you are receiving a list curated through Twitter’s algorithm.
Within the DNA of a number of Twitter’s patents and algorithm for ranking Tweets is the concept of “interestingness.”
This was quite likely inspired by a patent granted to Yahoo in 2006 called “Interestingness ranking of media objects”, which described the ranking methods used in the algorithm for Flickr (the dominant social media photo-sharing service that has been subsequently eclipsed by Instagram and Pinterest).
That earlier algorithm for Flickr bears a great many similarities to Twitter’s contemporary patents. It used similar and even identical factors for computing interestingness. These included:
One could easily describe Twitter’s algorithm as taking the Flickr interestingness algorithm, expanding upon some of the factors involved, computing it through a more sophisticated machine learning process, interpreting content based upon natural language processing (NLP), and incorporating a number of additional variations to enable rapidity for presentation in near real-time for a gargantuan number of users simultaneously.
It is also of interest to focus some on methods used by Twitter to detect spam, spam user accounts, and to demote or suppress spam Tweets from view.
The policing for disinformation, other policy-violating content, and harassment is likewise intense, but that does not necessarily converge as much with ranking evaluations.
Some of the spam detection patents are interesting because I frequently see users running afoul of Twitter’s spam suppression processes quite unintentionally, and there are a number of things one may do that end up sandbagging one’s efforts to promote to and interact with Twitter’s audience. Twitter has had to build aggressive watchdog processes to police and remove spam, and even the most prominent users can trip these processes from time to time.
Thus, an understanding of Twitter’s spam factors can be important as they can cause one’s Tweets to get deductions from interestingness they would otherwise have, and this loss in the relevancy scores can reduce the visibility and distribution power of your Tweets.
So, what are the factors mentioned in Twitter’s patents for assessing “interest”, and which influence how Twitter scores Tweets for rankings?
More recent Tweets are generally much preferred. Aside from specific keyword and other types of searches, most Tweets shown would be from the last few hours. Some “in case you missed it” Tweets may also be included, which appear to range primarily over the last day or two.
In general, Google and other platforms have indicated that users tend to prefer image and video media, so a Tweet containing either might get a higher score.
Twitter specifically cites image and video cards. This refers to websites that have implemented Twitter Cards, which enable Twitter to display richer preview snippets when Tweets contain links to webpages with the card markup.
Tweets with links that show images and video are generally more engaging to users, but there may be an additional advantage for Tweets linking to pages that have the card markup for displaying the card content.
Twitter cites Likes and Retweets, but additional metrics related to the Tweet would also potentially apply here. Interactions include:
While most impressions come from the display of the Tweet in timelines, some impressions are derived when Tweets are shared through embedding in webpages. It is possible that those impressions numbers might also affect the interestingness score for the Tweet.
One Twitter patent describes computing a score for a Tweet representing how likely it is that followers of the Tweet’s Author in the social messaging system will interact with the message, the score being based on the computed interaction level deviation between the observed interaction level of Followers of the Author and the expected interaction level of the Followers.
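One way to picture this deviation score is as the relative gap between what a Tweet actually received and what the Author’s Followers usually deliver. The sketch below is an assumption on my part: the patent describes an observed-versus-expected comparison, but the simple ratio and the use of a historical average as the “expected” level are illustrative choices.

```python
from statistics import mean


def interaction_deviation(observed, expected):
    """Relative deviation of observed interactions from the expected level."""
    if expected <= 0:
        raise ValueError("expected interaction level must be positive")
    return (observed - expected) / expected


# Hypothetical history: interactions on the Author's previous Tweets.
past_interactions = [30, 50, 40]
expected_level = mean(past_interactions)  # 40

strong_tweet = interaction_deviation(60, expected_level)  # 50% above expectation
weak_tweet = interaction_deviation(30, expected_level)    # 25% below expectation
```

A positive value suggests the Tweet is outperforming the Author’s norm, which is the kind of signal that could raise its score; a negative value suggests the opposite.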
One type of classification is the length of the text contained in the Tweet, which could be classified as a numerical value (e.g. 103 characters), or it could be designated as one of a few categories (e.g., short, medium, or long).
According to topics involved with a Tweet, it might be assessed to be more or less interesting – for some topics, short might be more beneficial, and for some other topics, medium or long length might make the Tweet more interesting.
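As a rough sketch of length classification, the function below buckets a Tweet by character count and then looks up a per-topic weight. The cutoffs (70 and 140 characters), the topic names, and the weights are all invented for illustration; the patents do not specify them.

```python
def length_category(text, short_max=70, medium_max=140):
    """Bucket a Tweet's text as short, medium, or long (hypothetical cutoffs)."""
    n = len(text)
    if n <= short_max:
        return "short"
    if n <= medium_max:
        return "medium"
    return "long"


# Hypothetical per-topic preferences: some topics favor brevity, others depth.
TOPIC_LENGTH_WEIGHT = {
    "breaking_news": {"short": 1.2, "medium": 1.0, "long": 0.8},
    "how_to": {"short": 0.8, "medium": 1.0, "long": 1.2},
}


def length_weight(topic, text):
    """Return the weight for this topic and length, or a neutral 1.0 if unknown."""
    return TOPIC_LENGTH_WEIGHT.get(topic, {}).get(length_category(text), 1.0)


example_category = length_category("x" * 103)  # the 103-character example above
```

With these made-up cutoffs, the 103-character example lands in the medium bucket.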
Past interactions with the author of a Tweet will increase the likelihood (and ranking score in one’s timeline) that one will see other Tweets by that same author.
These social graph interaction metrics can include scoring by the origin of the relationship.
So, a past history of replying-to, liking, or Retweeting an author’s Tweets, even if one does not follow that account, can increase the likelihood one will see their latest Tweets.
It is likely that the recency of one’s interactions with a Tweet author also factors into this, so if you have not interacted with one of their Tweets for a long time, the potential visibility of their newer Tweets may decrease for you.
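If recency of interaction does matter, one plausible way to model it is exponential decay, where each past interaction counts for less as it ages. This is purely an illustrative assumption: the half-life of 30 days and both function names are hypothetical.

```python
def interaction_weight(days_ago, half_life_days=30.0):
    """Weight of one past interaction; it halves every half_life_days."""
    return 0.5 ** (days_ago / half_life_days)


def author_affinity(interaction_ages_days, half_life_days=30.0):
    """Sum the decayed weights of all past interactions with one author."""
    return sum(interaction_weight(d, half_life_days) for d in interaction_ages_days)


recent_fan = author_affinity([1, 2, 5])        # several interactions this week
lapsed_fan = author_affinity([200, 250, 300])  # nothing for more than six months
```

Both users interacted three times, but the lapsed one ends up with a far smaller affinity value, which matches the intuition that an author’s newer Tweets would gradually fade from that user’s timeline.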
In the context of the algorithm, “author” and “account” are essentially used to mean the same thing, so Tweets from a corporate account are treated the same as Tweets from an individual.
This score can be calculated by an author’s relationships and interactions with other users.
The example given in the patent is that an author followed by multiple high profile or prolific accounts would have a high credibility score.
While one rating value cited is “low”, “medium”, and “high”, the patent also suggests a scale of rating values from 1 to 10, and it can include a qualitative and/or quantitative factor.
I would guess that a range like 1 to 10 is much more likely. It seems likely that some of the spam assessment values could be used to subtract from an Author Credibility Rating. More on potential spam assessment factors in the latter portion of this article.
It is possible that authors that are assessed to be more relevant for a particular topic may have a higher Author Relevancy value. Also, mentions of an Author may make them more relevant in the context of the Tweets mentioning them.
The patents also speak about associating Authors with topics, so it is possible that Authors that Tweet involving specific topics on a frequent basis, along with good engagement rates, may be deemed to have higher relevancy when their Tweets involve that topic.
Tweets may be classified based on properties of the Author. These metrics may influence the relative interestingness of the Author’s messages. Such Author Metrics include:
Tweets get classified according to the topics they involve. There are some very sophisticated algorithms involved in classifying the Tweets.
Twitter users often have selected topics to be associated with their accounts, and you will obviously be shown popular Tweets from the topics you have selected. But, Twitter also automatically creates topics based off of keywords found in Tweets.
Based on your interactions with Tweets and the accounts you follow, Twitter is also predicting topics that you would likely be interested in, and showing you some Tweets from those topics despite you not formally subscribing to the topics.
Twitter’s system is highly complex, and allows custom ranking models to potentially be applied to Tweets for particular topics and when particular phrases are present.
Twitter has a large staff that works to develop models for particular “customer journeys”, and this would appear to coincide with patent descriptions of how editors could set rules on topic-oriented posts and keywords or phrases in posts.
For instance, posts containing text about “hiring now” or “will be on TV” might be considered boring for a topic, while phrases like “fresh”, “on sale”, or “today only” might be given greater weight as they could be predicted to be more interesting.
This could be quite difficult to cater to, as there is a huge field of potential topics and custom weightings that could be applied.
One recent job posting at Twitter for a Staff Product Designer, Customer Journey described how the position would help:
“Whether you’re looking for Ariana Grande fanart, #herpetology, or extreme unicycling, it’s all happening on Twitter. Our team is responsible for helping new members navigate the diverse array of public conversations happening on Twitter and quickly find a sense of belonging…”
“Gather insights from data and qualitative research, develop hypotheses, sketch solutions with prototypes, and test ideas with our research team and in experiments.”
“Document detailed interaction models and UI specifications.”
“Experience designing for machine-learning, rich taxonomies, and / or interest graphs.”
This description sounds very similar to what’s described in Twitter’s patent for “System and method for determining relevance of social content” where:
“Editors might set rules on classifying certain phrases as more or less interesting…”
“…an editor may decide that some phrases and attributes are interesting in all content, regardless of the category of place that authors the content. For instance, the phrase ‘on sale’ or ‘event’ may be interesting in all cases and a positive weight may be applied.”
One patent describes how Tweets detected to have commercial language could be assigned a lower score than Tweets that did not have commercial language. (Contrarily, such weights could be flipped if the user was conducting searches indicating an interest in purchasing something, so that Tweets containing commercial language could be given a higher weight.)
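The editor rules and the commercial-language adjustment can be sketched as a small phrase-weight table plus a flag for purchase intent. The phrases come from the examples above; the point values, the list of commercial terms, and the function name are hypothetical.

```python
# Hypothetical editor-assigned weights, using the example phrases from the text.
PHRASE_WEIGHTS = {
    "on sale": 3,
    "today only": 3,
    "fresh": 2,
    "hiring now": -3,
    "will be on tv": -2,
}

# Hypothetical markers of commercial language and the size of the adjustment.
COMMERCIAL_TERMS = ("buy now", "discount", "free shipping")
COMMERCIAL_ADJUSTMENT = 2


def phrase_score(text, purchase_intent=False):
    """Sum phrase weights, then demote (or promote) commercial language."""
    lowered = text.lower()
    score = sum(w for phrase, w in PHRASE_WEIGHTS.items() if phrase in lowered)
    if any(term in lowered for term in COMMERCIAL_TERMS):
        # The sign flips when the viewer's searches suggest they want to buy.
        score += COMMERCIAL_ADJUSTMENT if purchase_intent else -COMMERCIAL_ADJUSTMENT
    return score


bakery = phrase_score("Fresh bread, today only!")
ad_default = phrase_score("Buy now with free shipping")
ad_shopper = phrase_score("Buy now with free shipping", purchase_intent=True)
```

The same promotional Tweet scores negatively for a typical viewer and positively for someone whose searches indicate purchase intent.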
Time of day can be used to impact relevancy. For instance, a rule could be implemented to lend more weight to Tweets mentioning “Coffee” between 8:00am to 10:00am, and/or to Tweets posted by coffee shops.
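A time-of-day rule like the coffee example could be as simple as the sketch below. The boost factor of 1.5 is invented, and I am assuming the window applies to the time at which the timeline is being scored; the patent language leaves that detail open.

```python
from datetime import time


def time_of_day_boost(text, now, keyword="coffee",
                      start=time(8, 0), end=time(10, 0), boost=1.5):
    """Return a multiplier above 1.0 when the keyword appears inside the window."""
    in_window = start <= now <= end
    if in_window and keyword in text.lower():
        return boost
    return 1.0


morning = time_of_day_boost("Coffee is brewing!", time(9, 0))
afternoon = time_of_day_boost("Coffee is brewing!", time(14, 0))
```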
Patents describe how “place references” in Tweets could invoke greater weight for Tweets about a place, and/or to accounts associated with the place reference versus other accounts that merely mention the place. Also geographic proximity between the location of a user’s device and location associated with content items (the Tweet text, image, video, and/or Author) can increase or decrease potential relevancy.
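Geographic proximity can be illustrated with the standard haversine great-circle distance, followed by a weight that shrinks as distance grows. The haversine formula itself is well established; the `proximity_weight` function and its 50 km scale are my own assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def proximity_weight(distance_km, scale_km=50.0):
    """Hypothetical weight: 1.0 at zero distance, 0.5 at scale_km, falling after."""
    return 1.0 / (1.0 + distance_km / scale_km)


one_degree_north = haversine_km(0.0, 0.0, 1.0, 0.0)  # roughly 111 km
nearby = proximity_weight(5.0)
far_away = proximity_weight(500.0)
```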
Language of the Tweet can be classified (e.g., English, French, etc.).
The language may be determined automatically using various automated language assessment tools.
A Tweet in a particular language would be of more interest to speakers of the language and of less interest to others.
Tweets can be classified based on whether they are replies to previous Tweets. A Tweet that is a reply to a previous Tweet may be deemed less interesting than a Tweet concerning a new topic.
In one patent description, the topic of a Tweet could determine whether the Tweet will be designated to be displayed to another account or included in other accounts’ message streams.
When you are viewing your timeline, there are instances where some of a Tweet’s replies are also displayed with the main Tweet, such as when the Reply Tweets are posted by accounts you follow. In most cases, the Reply Tweets will only be viewable when one clicks to view the thread, or clicks the Tweet to view all the Replies.
This is an odd concept that I believe might not be in production.
Twitter describes Blessed Accounts as being identified within a particular conversation’s graph, where the original Author in a conversation would be deemed “blessed”, and out of the subsequent replies to the original post, any of the Replies that is subsequently replied-to by the blessed account becomes “blessed” as well.
Those Tweets posted by Blessed Accounts in the conversation would be given increased relevance scores.
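My reading of the Blessed Accounts description can be sketched as a small propagation over the conversation. The data layout and function name are hypothetical, and I am assuming the “blessed” status spreads transitively (a newly blessed account can bless others by replying to them), which the patent language does not make fully explicit.

```python
def find_blessed(tweets):
    """tweets maps tweet_id -> (author, parent_id); the root has parent_id None."""
    root_author = next(a for a, parent in tweets.values() if parent is None)
    blessed = {root_author}
    changed = True
    while changed:  # repeat until no new account becomes blessed
        changed = False
        for author, parent_id in tweets.values():
            if parent_id is None or author not in blessed:
                continue
            parent_author = tweets[parent_id][0]
            # A blessed account replied to this parent, so its author is blessed.
            if parent_author not in blessed:
                blessed.add(parent_author)
                changed = True
    return blessed


conversation = {
    "t1": ("alice", None),  # original Tweet
    "t2": ("bob", "t1"),    # bob replies to alice
    "t3": ("carol", "t1"),  # carol replies to alice
    "t4": ("alice", "t2"),  # alice replies to bob's reply
}
blessed = find_blessed(conversation)
```

In this conversation alice is blessed as the original Author and bob becomes blessed because alice replied to him, while carol does not.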
This is not mentioned in Twitter patents, but it makes too much sense in context of all the other factors they have mentioned to pass up.
A lot of major content websites frequently have their links shared on Twitter, and Twitter could easily create a website profile reputation/popularity score that could also factor into the rankings of Tweets when links to content on those websites are posted.
News sites, information resources, entertainment sites – all of these could have scores developed from the same factors used to assess Twitter accounts. Tweets from better-liked and better-engaged-with websites could be given greater weight than relatively unknown and less-interacted-with websites.
Yes, if you suspected the blue badge next to usernames conveys preferential treatment, there is specific verbiage in one of Twitter’s patents that confirms they have at least considered this.
Since Verified accounts often already have various other popularity indicators associated with them, it is not readily apparent if this factor is in-use or not. Tweets posted by an account that is Verified may be given a higher relevance score, enabling them to appear more than unverified accounts’ Tweets.
Here is the patent description:
“In one or more embodiments of the invention, the conversation module (120) includes functionality to apply a relevance filter to increase the relevance scores of one or more authoring accounts of the conversation graph which are identified in a whitelist of verified accounts. For example, the whitelist of verified accounts can be a list of accounts which are high-profile accounts which are susceptible to impersonation. In this example, celebrity and business accounts would be verified by the messaging platform (100) in order to notify users of the messaging platform (100) that the accounts are authentic. In one or more embodiments of the invention, the conversation module (120) is configured to increase the relevance scores of verified authoring accounts by a predefined amount/percentage.”
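The “predefined amount/percentage” language in that passage suggests something as simple as the following sketch. The account names and the 10% figure are placeholders; the patent does not disclose an actual value.

```python
# Hypothetical whitelist and boost; the patent gives no real accounts or numbers.
VERIFIED_WHITELIST = {"@brand", "@celebrity"}
VERIFIED_BOOST_PERCENT = 10


def apply_verified_boost(author, relevance_score):
    """Raise the relevance score by a fixed percentage for whitelisted authors."""
    if author in VERIFIED_WHITELIST:
        return relevance_score * (1 + VERIFIED_BOOST_PERCENT / 100)
    return relevance_score


verified_score = apply_verified_boost("@brand", 50.0)      # about 55.0
unverified_score = apply_verified_boost("@someone", 50.0)  # unchanged
```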
This is a binary flag indicating whether the Tweet has been identified as containing a topic that was trending at the time the message was broadcast.
Twitter may be able to use an account holder’s mobile device information to infer Gender of the account holder, or infer interests in topics such as News, Sports, Weight Training, and other topics.
Some mobile devices provide information about other apps loaded on the phone for purposes of diagnosing potential application programming conflicts. Thus, some Tweets matching your Gender, Sexual Orientation, and Topical Interests could be given more interestingness points simply based upon inferences made from your phone’s apps. (See: https://screenrant.com/android-apps-collecting-app-data/ )
Twitter states that:
“Our list of considered features and their varied interactions keeps growing, informing our models of ever more nuanced behavior patterns.”
So this list of factors is likely something of an underrepresentation of the factors they may be using, and their list may be expanding.
Also imagine that a custom combination of some of the above factors may be applied as models for Tweets associated with particular topics, lending a large potential complexity to rankings through machine learning methods. (Again, the machine learning applied to create rank weighting models custom to particular queries or topics is very similar to methods that are likely in use with Google.)
Twitter has stated that the scoring of Tweets happens each time one visits Twitter, and each time one refreshes their timeline. Considering some of the complex factors involved, that is very fast!
Twitter uses A/B testing of weightings of ranking factors, and other algorithm alterations, and determines whether a proposed change is an improvement based on engagement and time viewing/interacting with a Tweet. This is used to train ranking models.
The involvement of machine learning in this process suggests that ranking models could be produced for many specific scenarios, and potentially specific to particular topics and types of users. Once developed, the model can get tested, and if it improves engagement, it can get rapidly rolled-out to all users.
There are a lot of inferences that can be drawn from the list of potential ranking factors, and which can be used by marketers in order to improve their Tweeting tactics.
A Twitter account that only posts announcements about its products and promotional information about its company will likely not have as much visibility as accounts that are more interactive with their community, because interactions produce more ranking signals and potential benefits.
Social media experts have long recommended an approach of blending types of posts rather than merely publishing self-referential promotion – these strategies include “The Rule of Thirds”, “The 80/20 Rule”, and others.
The Twitter ranking factors likely support these theories, as eliciting more interactions from a larger number of Twitter users is likelier to increase an account’s visibility.
For instance, a large company account with many followers could post an interesting poll to get advice on what features to add to its product. The votes and comments posted by users will make it such that the respondents will be much more likely to see the company’s next posting due to the recent interactions, and that next posting could be promoting or announcing something new. And, the respondents’ followers might also be more likely to see the company’s next posting, since Twitter appears to factor-in that users with similar interests may be more open to seeing content matching their interests.
Also, the factors suggest a number of potentially beneficial approaches.
When posting a Tweet promoting a product or making an announcement, including something that elicits a response from one’s followers could easily expand exposure on the platform, as each respondent’s reply to your Tweet may increase the odds that their direct followers will see both the original Tweet and their connection’s reply Tweet.
Leveraging the social graph aspect of Twitter’s algorithm can help to increase the interestingness of your Tweets, and can increase exposure of your Tweets for other users.
Spam detection algorithms can negatively impact Tweet ranking ability.
For one thing, Twitter is very fast to suspend accounts that are blatantly spamming, and in cases where it is obvious and unequivocal, one can expect the account to get terminated abruptly, causing all of its Tweets to disappear from conversation graphs and timelines, and causing the account profile to be no longer available to view.
In yet other instances where it is not as clear whether an account is spamming, the account’s Tweets could simply be demoted by application of negative rank weight scores, or the Tweets could get locked or suspended until or if the account holder takes a corrective action or verifies their identity.
For example, a Twitter account with a long history of good Tweets might abruptly begin posting Viagra ads or links to malware, such as if an established account became hacked. Twitter might temporarily suspend the account until corrective actions were taken, such as passing a CAPTCHA verification, or receiving a verification code via cellphone and changing passwords. Another example could be a new user that accidentally passes over some threshold of following too many accounts within a short timeframe, or posting a little too frequently.
Twitter employs a number of methods for detecting spam and sidelining it so users see it less.
Much of the automated detecting relies upon detecting a combination of account profile characteristics, account Tweeting behaviors, and content found in the account’s Tweets.
Twitter has developed a number of characteristic spam “fingerprints” in order to perform rapid pattern detection. One Twitter patent describes how:
“Spam is determined by comparing characteristics of identified spam accounts, and building a ‘similarity graph’ that can be compared with other accounts suspected of spam.”
Tweets identified as potentially containing spam could be flagged with a binary value like “yes” or “no”, and then Tweets that are flagged can get filtered out of timelines.
It is equally possible for there to be a scale of spamminess, computed from multiple factors, and once a Tweet or account surpasses a threshold, it then suffers demotion. I think it is worthwhile to mention these because Twitter users may not understand the implications of how they use the platform. For example, posting one overly-aggressive Tweet might negatively impact an account’s subsequent Tweets for some period of time. Repeated edgy behavior could result in worse, such as complete account deletion, with no opportunity to recover.
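The comparison against known spam characteristics, combined with a threshold on a spamminess scale, can be sketched with a simple set-overlap measure. Jaccard similarity is my stand-in for whatever the patent’s “similarity graph” actually computes, and the traits, the fingerprint, and the 0.6 threshold are all made up for illustration.

```python
def jaccard(a, b):
    """Overlap between two trait sets: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical fingerprint built from accounts already identified as spam.
KNOWN_SPAM_FINGERPRINTS = [
    {"default_avatar", "new_account", "link_in_every_tweet", "mass_following"},
]
SPAM_THRESHOLD = 0.6  # hypothetical cutoff on the spamminess scale


def spam_score(account_traits):
    """Highest similarity between this account and any known spam fingerprint."""
    return max(jaccard(account_traits, fp) for fp in KNOWN_SPAM_FINGERPRINTS)


def is_demoted(account_traits):
    return spam_score(account_traits) >= SPAM_THRESHOLD


suspicious = {"default_avatar", "new_account", "link_in_every_tweet"}
ordinary = {"new_account", "verified_email"}
suspicious_score = spam_score(suspicious)  # 3 shared traits out of 4 total
ordinary_score = spam_score(ordinary)      # 1 shared trait out of 5 total
```

Note that the ordinary new account still picks up a non-zero score, which illustrates how an innocent user can drift toward a threshold without intending to.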
I will add a few factors here that are not specifically mentioned in Twitter patents or blog posts because Twitter does not reveal all spam identification factors for obvious reasons. But, some spam and spam account characteristics seem so obvious that I am adding a few from personal observations or from well-regarded research sources to provide a wider understanding of what can incur spam demotions.
Simply listing out spam identification factors sharply understates Twitter’s sophisticated systems used for spam identification and spam management.
Major Silicon Valley tech companies have fought spam for years now, and it has been described as a sort of arms race.
The tech company will create a method to detect the spam, and the spammers then evolve their processes to elude detection, and then the cycle repeats again, and again.
Twitter’s patents illustrate a huge sophistication in terms of employing components of Artificial Intelligence, social graph analysis, and methods that combine synchronous and asynchronous processing in order to deliver content extremely rapidly.
The AI components include:
As the ranking determinations can be based upon unique, abstracted, machine learning models according to specific phrases, topics, and interest profiling, what works for one area of interest may work a little differently for other areas of interest.
Even so, I think that looking at these many potential ranking factors that have been described in Twitter patents can be useful for marketers who want to attain greater exposure on Twitter’s platform.
I served this year as an expert witness in an arbitration in which a company sued Twitter for unfair trade practices, and the case was amicably settled recently.
As an expert witness, I am often privy to secret information, including private communications such as employee emails within major corporations, as well as other key documents that can include data, reports, presentations, employee depositions and other information.
In such cases, I am bound by legal protective orders and agreements not to disclose information that was revealed to me in order to be sufficiently informed on the matters I am asked to opine upon, and this was no exception.
I have not disclosed any information covered by the protective order in this article from my recently-resolved case.
I have gained greater understanding of and insight into some aspects of how Twitter functions from context, from observations of Twitter in public use, from logical projections based on its various algorithm descriptions, and from reading Twitter’s patents and other public disclosures subsequent to the resolution of the case I served upon, including the following sources:
The post Twitter’s algorithm ranking factors: A definitive guide appeared first on Search Engine Land.