Resolved

Parsing Error on some feed

description

comments

Kostik wrote Oct 8, 2007 at 11:01 AM

The feed http://www.actu-mmorpg.com/index2.php?option=com_rss&feed=RSS2.0&no_html=1 requires either no UserAgent header or both the UserAgent and Accept headers. Please check the updated FeedReaderSettings class (FeedDotNet ver. 1.2.1.1). It now has 3 new members:
string HttpUserAgentString
string HttpAcceptString
Dictionary<HttpRequestHeader, string> HttpHeaders

The defaults of HttpUserAgentString and HttpAcceptString are set to browser defaults, so that we can simulate a real browser agent (some feeds, among them http://www.actu-mmorpg.com/index2.php?option=com_rss&feed=RSS2.0&no_html=1, require this behavior). If your browser (or another feed agent/library) still shows a feed as expected but FeedDotNet doesn't, you can experiment with adding custom HTTP headers to the HttpHeaders collection.
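
To illustrate what "simulating a browser agent" means at the HTTP level, here is a minimal sketch that uses only System.Net and does not touch FeedDotNet itself. The class name BrowserLikeRequest and both header values are assumptions made for this example; they are not FeedDotNet's actual defaults, which this thread does not quote.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper (not part of FeedDotNet): prepares a request that
// sends both a User-Agent and an Accept header, as a browser would.
public static class BrowserLikeRequest
{
    // Assumed example values, chosen for illustration only.
    public const string UserAgent = "Mozilla/5.0 (compatible; ExampleFeedClient/1.0)";
    public const string Accept = "application/rss+xml, application/atom+xml, application/xml, text/xml, */*";

    // Builds the request without sending it. extraHeaders may be null.
    public static HttpWebRequest Create(string feedUrl, IDictionary<HttpRequestHeader, string> extraHeaders)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(feedUrl);

        // User-Agent and Accept have dedicated properties on HttpWebRequest.
        request.UserAgent = UserAgent;
        request.Accept = Accept;

        if (extraHeaders != null)
        {
            // Only pass headers without a dedicated property here
            // (for example Accept-Language); restricted headers may be rejected.
            foreach (KeyValuePair<HttpRequestHeader, string> header in extraHeaders)
            {
                request.Headers.Set(header.Key, header.Value);
            }
        }
        return request;
    }
}
```

Calling BrowserLikeRequest.Create(url, null).GetResponse() would then fetch the feed with both headers present, which is one of the two combinations this feed is reported to accept.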



Not valid XML document: xml declaration not at start of external entity:
http://www.motomag.com/spip/backend-breves.php3?id_rubrique=1

No errors found:
http://www.mondesvirtuels.info/backend.php?op=RSS2.0
http://flux.jeuxonline.info/actualites.rss
http://www.gentoo.fr/securite/alertes-securite.rss
http://gentoofr.org/backend.php3
http://www.games4linux.eu/spip.php?page=backend&id_rubrique=1
http://www.clubic.com/xml/demo.xml
http://www.debianaddict.org/backend.php3
http://www.abondance.com/rss/rss.xml
http://www.cafebabel.com/fr/babel_rss.xml
http://www.tf1.fr/xml/rss/0,,9,00.xml
http://archlinux.fr/index.php?option=com_rss&feed=RSS2.0&no_html=1
http://www.jeuxvideo.fr/xml/tout.xml
http://www.actumoto.fr/feed/
http://www.krissnature.net/atom.php
http://www.latribune.fr/Cobrand/Articles.nsf/Articles.rss?OpenView&Channel=Bourse
http://www.easybourse.com/RSS/easybourse-rss-news-fr.rss
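
The "xml declaration not at start of external entity" message for the motomag feed usually means that something precedes the <?xml ...?> declaration. The sketch below assumes the culprit is leading whitespace or a byte-order mark (the thread does not say what the feed actually emits) and shows one way a caller could strip it before parsing. LenientXml is a hypothetical helper, not part of FeedDotNet.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not part of FeedDotNet).
public static class LenientXml
{
    // Removes a byte-order mark and whitespace that precede the first
    // markup character, so the XML declaration is the first node again.
    public static string StripLeadingJunk(string rawXml)
    {
        if (rawXml == null) throw new ArgumentNullException("rawXml");
        return rawXml.TrimStart('\uFEFF', ' ', '\t', '\r', '\n');
    }

    // Creates a reader over the cleaned-up text.
    public static XmlReader CreateReader(string rawXml)
    {
        return XmlReader.Create(new StringReader(StripLeadingJunk(rawXml)));
    }
}
```

This only covers whitespace and a BOM; if the server prints other output (for example a PHP warning) before the declaration, the feed itself has to be fixed.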

ncornu wrote Oct 9, 2007 at 9:27 PM

Yes, I tried with a different UserAgent, without success.
Thanks for the new method.

I use FeedDotNet on Linux with Mono 1.2.5.1, and I get errors. I will update Mono when a new release comes out. With another feed library I had fewer errors.
Extract of the Error log:

2007-10-09 23:00:33,333 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.jeuxvideo.fr/xml/tout.xml System.UriFormatException: Invalid URI: The format of the URI could not be determined.
at System.Uri..ctor (System.String uriString, UriKind uriKind) [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:33,849 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.actumoto.fr/feed/ System.NotImplementedException: The requested feature is not implemented.
at System.UriParser.IsWellFormedOriginalString (System.Uri uri) [0x00000]
at System.Uri.IsWellFormedOriginalString () [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:34,108 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.actu-mmorpg.com/index2.php?option=com_rss&feed=RSS2.0&no_html=1 System.UriFormatException: Invalid URI: The format of the URI could not be determined.
at System.Uri..ctor (System.String uriString, UriKind uriKind) [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:34,125 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.krissnature.net/atom.php System.NotImplementedException: The requested feature is not implemented.
at System.UriParser.IsWellFormedOriginalString (System.Uri uri) [0x00000]
at System.Uri.IsWellFormedOriginalString () [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.AtomParser.readEntry (System.Xml.XmlReader subReader) [0x00000]
at FeedDotNet.AtomParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:41,645 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.latribune.fr/Cobrand/Articles.nsf/Articles.rss?OpenView&Channel=Bourse System.UriFormatException: Invalid URI: The format of the URI could not be determined.
at System.Uri..ctor (System.String uriString, UriKind uriKind) [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:41,778 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.easybourse.com/RSS/easybourse-rss-news-fr.rss System.UriFormatException: Invalid URI: The format of the URI could not be determined.
at System.Uri..ctor (System.String uriString, UriKind uriKind) [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:47,949 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.dotnet-project.com/dpRss.aspx System.NullReferenceException: Object reference not set to an instance of an object
at System.Uri.IsWellFormedOriginalString () [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:48,036 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.techheadbrothers.com/rss.aspx?id=news System.NullReferenceException: Object reference not set to an instance of an object
at System.Uri.IsWellFormedOriginalString () [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]
2007-10-09 23:00:53,817 [-1210541360] ERROR FeedCrawler.Program [(null)] - Error parsing the feed, Feed url: http://www.notre-planete.info/news/news.xml System.UriFormatException: Invalid URI: The format of the URI could not be determined.
at System.Uri..ctor (System.String uriString, UriKind uriKind) [0x00000]
at System.Uri.IsWellFormedUriString (System.String uriString, UriKind uriKind) [0x00000]
at FeedDotNet.RssParser.Parse () [0x00000]
at FeedDotNet.FeedReader.Read (System.String uri, FeedDotNet.FeedReaderSettings settings) [0x00000]

I will use the new method later; for now I ignore feeds that fail because they need custom HTTP headers.

Thanks again.
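
All of the stack traces above pass through System.Uri.IsWellFormedUriString, which on this Mono version throws (NotImplementedException, UriFormatException, NullReferenceException) instead of returning false. A minimal sketch of a defensive check that a caller or parser could use instead is shown below. SafeUri is a hypothetical helper, and whether Uri.TryCreate behaves any better than IsWellFormedUriString on Mono 1.2.5.1 is an assumption this thread does not confirm.

```csharp
using System;

// Hypothetical helper (not part of FeedDotNet).
public static class SafeUri
{
    // Returns true when 'value' parses as an absolute http or https URI.
    // Any exception raised by the runtime's Uri code is treated as
    // "not usable" so that one bad link cannot abort parsing of a feed.
    public static bool IsUsableHttpUri(string value)
    {
        if (string.IsNullOrEmpty(value)) return false;
        try
        {
            Uri parsed;
            if (!Uri.TryCreate(value.Trim(), UriKind.Absolute, out parsed)) return false;
            return parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps;
        }
        catch (Exception)
        {
            // Deliberately broad: the log above shows three different
            // exception types coming out of the same validation call.
            return false;
        }
    }
}
```

The scheme check is a design choice for feed links specifically; relax it if other schemes should be accepted.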
