terriko: (Default)
[personal profile] terriko
Short version:

I'd like some help figuring out why RSS feeds that include IPython notebook contents (or more specifically, the CSS from IPython notebooks) are showing up really messed up in the PlanetPlanet blog aggregator. See the Python summer of code aggregator and search for an MNE-Python post to see an example of what's going wrong.

Bigger context:

One of the things we ask of Python's Google Summer of Code students is regular blog posts. This is a way of encouraging them to be public about their discoveries and share their process and thoughts with the wider Python community. It's also very helpful to me as an org admin, since it makes it easier for me to share and promote the students' work. It also helps me keep track of everyone's projects without burning myself out trying to keep up with a huge number of mailing lists for each "sub-org" under the Python umbrella. Python sponsors students to work not only on the language itself, but also on projects that make heavy use of Python. In 2014, we have around 20 sub-orgs, so that's a lot of mailing lists!

One of the tools I use is PlanetPlanet, software often used for making free software "planets" or blog aggregators. It's easy to use and run, and while it's old, it doesn't require me to install and run an entire larger framework which I would then have to keep up to date. It's basically making a static page using a shell script run by a cron job. From a security perspective, all I have to worry about is that my students will post something terrible that then gets aggregated, but I'd have to worry about that no matter what blogroll software I used.

But for some reason, this year we've had some problems with some feeds, and it *looks* like the problem is specifically that PlanetPlanet can't handle IPython notebook formatted stuff in a blog post. This is pretty awkward, as IPython notebook is an awesome tool that I think we should be encouraging students to use for experimenting in Python, and it really irks me that it's not working. It looks like Chrome and Firefox parse the feed reasonably, which makes me think that somehow PlanetPlanet is the thing that's losing a <style> tag somewhere. The blogs in question seem to be on Blogger, so it's also possible that it's Google that's munging the stylesheet in a way that PlanetPlanet doesn't parse.
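To make that guess concrete, here is a minimal sketch of how a lost <style> tag could produce this kind of mess. It is not PlanetPlanet's or feedparser's actual code: the `NaiveSanitizer` class, the `naive_sanitize` helper, and the `ALLOWED_TAGS` list are all made up for illustration, using only Python's standard library. The idea is that a sanitizer which drops a disallowed tag but keeps the text inside it would dump the notebook's CSS into the page as visible text.

```python
from html.parser import HTMLParser

# Hypothetical whitelist, for illustration only; not PlanetPlanet's real list.
ALLOWED_TAGS = {"p", "div", "span", "pre", "code", "a", "b", "i", "em", "strong"}


class NaiveSanitizer(HTMLParser):
    """Drops tags outside ALLOWED_TAGS but keeps the text inside them.

    Attributes are discarded entirely to keep the sketch short.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # The suspected failure mode: text is kept even when its enclosing
        # tag was dropped, so the body of a <style> element becomes
        # ordinary page text.
        self.out.append(data)


def naive_sanitize(html):
    parser = NaiveSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# A tiny stand-in for a notebook-style post: a stylesheet, then content.
post = "<style>.cell { margin: 1em; }</style><p>Hello from a notebook</p>"
print(naive_sanitize(post))
# prints: .cell { margin: 1em; }<p>Hello from a notebook</p>
```

If something like this is happening, the CSS rules would appear as a wall of text at the top of each aggregated post, and the fix would be to drop the contents of <style> along with the tag itself.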

I don't suppose this bug sounds familiar to anyone? I did some quick googling, but unfortunately the terms are all sufficiently popular when used together that I didn't find any reference to this bug. I was hoping for a quick fix from someone else, but I don't mind hacking PlanetPlanet myself if that's what it takes.

Anyone got a suggestion of where to start on a fix?

Edit: Just because I saw someone linking this on Twitter, I'll update in the main post: I tried Mary's suggestion of Planet Venus (see comments below) out on Monday and it seems to have done the trick, so hurrah!

Date: May 31st, 2014 09:11 am (UTC)
From: [personal profile] puzzlement
When you say PlanetPlanet, do you mean code from http://planetplanet.org/ or http://intertwingly.net/code/venus/ ?

The latter is a fork of the former with unit tests, and an active maintainer. The config files should work across a switch. I know "switch tools" is frustrating tech advice, but having been elbow deep in these code bases a few times, I recommend anyone still using the Planetplanet code base switch to Venus.

Sam Ruby is, I think, responsive to requests for help on http://lists.planetplanet.org/mailman/listinfo/devel too. The Planetplanet folks haven't been for seven years or so.

Date: May 31st, 2014 09:13 am (UTC)
From: [personal profile] puzzlement
And in the event that Venus still doesn't parse it, it's likely to be a problem in feedparser, I'd guess, so that's where I'd poke with regards to fixing it. Maybe an issue of supposedly "risky" tags being stripped?
Edited Date: May 31st, 2014 09:40 am (UTC)

Date: June 7th, 2014 08:51 am (UTC)
From: [personal profile] puzzlement
Planet would be using a version of feedparser that's about seven years old, Venus six months to twelve months old (edit: looks like about three years old! but still). feedparser does a lot of its heavy lifting, so I'd guess that's the fix.

Glad I could help! Hope SoC goes well again.
Edited Date: June 7th, 2014 08:53 am (UTC)

