Letterhead automatically detects duplicate items in your RSS feed so that your curated content does not contain repeated entries. Understanding how duplicates are identified can help you manage your feed effectively.
How We Identify Duplicate Items
When pulling content from an RSS feed, we determine duplicates based on the following key attributes:
- originalUrl: The direct link to the curation item in the RSS feed. If this URL has already been processed, the item may be flagged as a duplicate.
- channelId: This identifier remains the same for all items pulled from a particular channel's RSS feed.
- feedUniqueId: This remains the same for all items sourced from the same RSS feed, helping us track content origins.
- type: A value of 4 indicates that the item was automatically pulled from an RSS feed.
By checking these attributes, our system ensures that the same item is not added more than once within a single RSS feed or channel.
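The check described above can be pictured with a minimal sketch. This is an illustration only, not Letterhead's actual implementation: it assumes an item counts as a duplicate when all four listed attributes match an item that was already processed, and the names CurationItem, dedupe_key, and filter_new_items are hypothetical.

```python
from dataclasses import dataclass

# Per the attribute list: a type of 4 marks an item pulled automatically from RSS.
RSS_ITEM_TYPE = 4


@dataclass(frozen=True)
class CurationItem:
    """Hypothetical record holding only the attributes named in this article."""
    originalUrl: str
    channelId: str
    feedUniqueId: str
    type: int


def dedupe_key(item: CurationItem) -> tuple:
    # Assumption: all four attributes together identify an item.
    return (item.originalUrl, item.channelId, item.feedUniqueId, item.type)


def filter_new_items(incoming: list, seen_keys: set) -> list:
    """Return items whose key has not been seen yet, recording each new key."""
    fresh = []
    for item in incoming:
        key = dedupe_key(item)
        if key in seen_keys:
            continue  # already processed: treated as a duplicate
        seen_keys.add(key)
        fresh.append(item)
    return fresh
```

Under this assumption, two items with the same originalUrl in the same feed and channel collapse to one, while the same URL arriving through a different channel or feed is kept as a separate item.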
Best Practices to Avoid Duplicate Content Issues
To maintain the integrity of your RSS feed and avoid unintentional duplication, we recommend the following best practices:
- Use Unique, Consistent URLs: Our system primarily detects duplicates based on originalUrl. If the same content appears under different URLs, each URL is treated as a distinct item. To prevent redundancy, make sure each piece of content is always published under a single, consistent URL.
- Monitor Your Feed Source: If your feed pulls from multiple sources, ensure that duplicate content isn’t appearing under different feeds.
- Standardize Content: If you frequently post similar content, consider structuring your URLs and metadata consistently to help our system correctly identify duplicates.
- Be Aware of Redirects in RSS Feed URLs: If your feed includes URLs that redirect to another destination, our system may treat them as unique even if they point to the same content. To avoid duplication, make sure your feed consistently uses direct URLs rather than URLs that pass through one or more redirects.
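If you generate your feed yourself, one way to keep URLs consistent is to canonicalize each link before publishing it. The sketch below is a publisher-side helper and an assumption on our part, not something Letterhead runs for you: canonicalize_url and find_duplicate_groups are hypothetical names, and the specific rules (lowercasing the host, dropping utm_ parameters, fragments, and trailing slashes, sorting query parameters) are illustrative choices you should adapt to your site. It does not follow redirects.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters starting with these prefixes are tracking-only.
TRACKING_PREFIXES = ("utm_",)


def canonicalize_url(url: str) -> str:
    """Reduce cosmetic URL variations to one consistent form."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/post/" and "/post" as the same path; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    query_pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(query_pairs))
    # The fragment (text after "#") is dropped.
    return urlunsplit((scheme, netloc, path, query, ""))


def find_duplicate_groups(urls: list) -> dict:
    """Group feed URLs that share a canonical form, keeping groups of 2+."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonicalize_url(url)].append(url)
    return {canon: members for canon, members in groups.items() if len(members) > 1}
```

Running find_duplicate_groups over the links in your feed before publishing can surface entries that differ only cosmetically and would otherwise be pulled in as separate items.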