Friday, May 23, 2008

Generating Google Base XML files in VB.NET

This may be a bit obscure, but since I struggled with this, I'm sure someone else in the future will struggle with it too. Hopefully this little post will help.

Goal: Create an XML file in RSS 2.0 format using VB.NET for use with Google Base

Challenge: Getting the .NET XmlSerializer object to generate the XML for you with the correct "g:" prefixes on the Google-specific tags.

Making XmlSerializer generate basic XML is rather trivial. A quick search of the Interwebs will show you numerous examples. Essentially you just write a class that matches the basic layout of the XML you want, populate the class, and send XmlSerializer off to convert it to XML.
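For anyone who has not seen the basic pattern, here is a minimal sketch of it. The Note class and the BasicSerializerSketch module are hypothetical names made up for this illustration; they are not part of the Google Base code further down.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical class used only for this illustration.
' XmlSerializer turns each public field into an element of the same name.
Public Class Note
    Public title As String
    Public body As String
End Class

Public Module BasicSerializerSketch
    ' Populates a Note and returns it serialized as an XML string.
    Public Function NoteToXml() As String
        Dim n As New Note()
        n.title = "Hello"
        n.body = "World"

        Dim serializer As New XmlSerializer(GetType(Note))
        Using writer As New StringWriter()
            serializer.Serialize(writer, n)
            Return writer.ToString()
        End Using
    End Function
End Module
```

Calling BasicSerializerSketch.NoteToXml() should give you a Note root element with a title and a body child, none of them prefixed.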

Getting it to generate the Google-specific XML needed for Google Base is a much more difficult problem, however. It becomes a bit tricky when you need some of the XML tags to be generic (for example, <title>), and others to have the Google prefix (à la <g:price>).

Solution: The key lies in a few hints you include in the Class definition to tell XmlSerializer what to do. Here's an example of the class I used:


Imports System.Collections
Imports System.Xml.Serialization

<XmlRoot(ElementName:="rss", Namespace:="")> _
Public Class rss
    <XmlElementAttribute("channel")> _
    Public channel As New rssChannel()
    <XmlAttributeAttribute("version")> _
    Public version As String = "2.0"
End Class

Public Class rssChannel
    Public title As String
    Public link As String
    Public description As String
    Public language As String = "en-us"
    ' Emits each entry of the collection as its own <item> element
    <XmlElementAttribute("item")> _
    Public item As New rssChannelItems()
End Class

' Strongly typed collection of items; XmlSerializer needs the Add method
' and the default Item property to serialize a CollectionBase
Public Class rssChannelItems
    Inherits CollectionBase

    Public Sub Add(ByVal Item As rssChannelItem)
        List.Add(Item)
    End Sub

    Default Public ReadOnly Property Item(ByVal Index As Integer) As rssChannelItem
        Get
            Return CType(List.Item(Index), rssChannelItem)
        End Get
    End Property
End Class

Public Class rssChannelItem
    ' Required
    Public id As String
    Public title As String
    Public description As String
    Public link As String
    ' Namespace must be the Google Base namespace URI (blank in this listing)
    <XmlElement(Namespace:="")> _
    Public price As Double

    ' Recommended
    <XmlElement(Namespace:="")> _
    Public provider_class As String
    <XmlElement(Namespace:="")> _
    Public year As Integer

    ' Optional
    <XmlElement(Namespace:="")> _
    Public agent As String
    <XmlElement(Namespace:="")> _
    Public area As String ' Sq Ft
    <XmlElement(Namespace:="")> _
    Public lot_size As String
    <XmlElement(Namespace:="")> _
    Public property_taxes As Double
    <XmlElement(Namespace:="")> _
    Public school_district As String
    <XmlElement(Namespace:="")> _
    Public zoning As String
End Class

Notice this part of the code:

<XmlElement(Namespace:="")> _

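It tells the XmlSerializer which XML namespace the element belongs to, and that is what gets the element its g: prefix. The Namespace value (shown blank in the listings here) needs to be the Google Base namespace URI, the same URI that is mapped to the "g" prefix through XmlSerializerNamespaces when you serialize. That's the revelation. Once you have that in place, you just create an instance of the class, populate it, and save the XML to disk.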
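To see the hint in isolation, here is a small sketch with one plain element and one namespaced element. SketchItem, PrefixSketch, and ItemToXml are hypothetical names for this illustration only, and the namespace URI in GoogleNs is my assumption of the Google Base namespace; check it against Google's documentation before relying on it.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical, trimmed-down item used only for this illustration.
<XmlRoot("item")> _
Public Class SketchItem
    ' No Namespace hint: serialized as a plain <title> element.
    Public title As String

    ' Namespace hint: serialized with whatever prefix is mapped to this URI.
    <XmlElement(Namespace:=PrefixSketch.GoogleNs)> _
    Public price As Double
End Class

Public Module PrefixSketch
    ' Assumption: the Google Base namespace URI. Verify before use.
    Public Const GoogleNs As String = "http://base.google.com/ns/1.0"

    ' Serializes one sample item and returns the XML as a string.
    Public Function ItemToXml() As String
        Dim item As New SketchItem()
        item.title = "Sample"
        item.price = 199.5

        ' Map the "g" prefix to the same URI used in the XmlElement hint.
        Dim ns As New XmlSerializerNamespaces()
        ns.Add("g", GoogleNs)

        Dim serializer As New XmlSerializer(GetType(SketchItem))
        Using writer As New StringWriter()
            serializer.Serialize(writer, item, ns)
            Return writer.ToString()
        End Using
    End Function
End Module
```

The two halves have to agree: the URI in the XmlElement hint puts the element in the namespace, and the XmlSerializerNamespaces entry decides that the namespace is written with the g prefix. With both in place, title should come out unprefixed and price should come out as g:price.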


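' Requires Imports System.IO and Imports System.Xml.Serialization
Sub GenerateGoogleBase()
    Dim rssFeed As rss = Nothing
    Dim rssItem As rssChannelItem

    Try
        rssFeed = New rss

        ' Channel-level fields
        With rssFeed.channel
            .title = "one"
            .description = "two"
            .link = "three"
        End With

        ' Create one item and add it to the channel
        rssItem = New rssChannelItem
        rssItem.id = "1234"
        rssFeed.channel.item.Add(rssItem)

        Dim xml As New XmlSerializer(GetType(rss), "http://")

        ' Map the "g" prefix to the Google Base namespace URI (blank in this listing)
        Dim xmlns As New XmlSerializerNamespaces()
        xmlns.Add("g", "")

        ' Using closes the file even if serialization fails
        Using strFile As New FileStream("C:\Test.xml", FileMode.Create)
            xml.Serialize(strFile, rssFeed, xmlns)
        End Using

    Catch ex As Exception
        ' Errors are swallowed here; log or rethrow in real code
    End Try
End Sub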

Wrap Up: As I said up front, yeah, this is a bit random and obscure. I'm sure I won't be the only person who struggles with this, though. My hope is that the next person down the line will find this post and be helped by my struggle.

Big Thanks to Phil Weber for the basic RSS code, and Martin Honnen for helping me with making it work with Google Base.