Monday, January 11, 2010

5 Reasons I like ORM

ORM (Object-Relational Mapping) tools allow for easy interaction between the objects we program with and the relational databases we store data in. Without them, it is customary to hand-write the code that handles database I/O by converting between the object model and the RDBMS. This includes code to save, retrieve and update objects. As these objects become more complex, more maintenance is required to keep the data access code (usually in the form of a DAO) in step with them. The addition of a single field can result in modifications across various DAO methods. The issue becomes more complex once your objects contain lists and sets that also need to be stored and retrieved.


Hibernate is an ORM implementation that removes a great deal of the repetitive, error-prone coding involved in creating DAOs and accessing an RDBMS. By keeping an XML-based (or annotation-based) mapping between objects and their tables, Hibernate simplifies the coding of data access methods. Moreover, hand-written SQL is rarely needed. Instead one writes HQL, a query language based on the properties of the objects rather than the columns of the underlying database's tables.


I'll go over five points that highlight key benefits for faster, more robust development of applications. This isn't a deep tutorial on Hibernate, but rather an overview of the benefits of ORM and their particular implementation in Hibernate. The five points I'll be going over in this and the upcoming posts are:


  • Part 1

    • Easy configuration. We'll see how little is needed to actually start working with Hibernate.

    • Easy to extend code. We'll see how easy it is to add extra fields and add relationships to other tables in the form of lists or sets.

  • Part 2

    • Easy to query. We'll see how easy it is to start creating queries and how little code is needed to obtain a fully populated object in our results.

    • Easy to create complex queries. We'll see how easy it is to maintain our queries as the object becomes more complex.

  • Part 3

    • Polymorphism. Hibernate supports polymorphism and we'll go over basic examples on how this is handled.


Why Hibernate? Hibernate is a robust, well-proven framework that works with many databases (Oracle, DB2, Microsoft SQL Server, MySQL, PostgreSQL, etc.; see https://www.hibernate.org/80.html). It is also available on both Java and .NET (as NHibernate), which makes one's expertise in Hibernate usable on two major development platforms. ORMs have been around for quite some time, but in the last couple of years I've seen them rise in popularity (and appear in more job requirements).


OK, enough intro. Let's get down to today's points.

One

It's easy

The Underlying Parts

There are three elements in Hibernate you need to keep in mind:

  • The session factory. That piece of code that actually hands over a connection to the database. It delivers a session you can then use to execute queries.

  • The Hibernate config. An XML file that holds the configuration for Hibernate to use. It contains information on how to create a connection (database type, location, username, password) and the Hibernate mapping files to the classes you wish to save and retrieve.

  • The Hibernate mapping files. A set of XML files that hold the relationship between class properties and tables. (1)

(1) Mapping files can be omitted if annotations are used in the source code. In this tutorial we'll use annotations to leverage the full benefits of ORM and minimize coding.

Even though Hibernate is complex, you can get going by understanding these three pieces. The benefits obtained quickly outweigh the time spent learning it.

Starting example

For our example we want to store data for a game. In this game we have participants and their fleets. Each participant can have a set of fleets at his or her disposal.

First of all we'll set up the Participant class. We'll start with something very basic: the name and the email. We create our simple bean as shown below and then add an id that will be the primary key in the database, in this case an Integer.

Once the base Java code is set we'll add the annotations. Annotations allow us to tell Hibernate how to store and retrieve this class. They also allow us to use Hibernate to create the database structure for us. First we add @Entity followed by the @Table annotation. This indicates that the class will be an entity to store and that it should be stored in the “participants” table. We add @Id and @GeneratedValue before id. This tells Hibernate that id will be an identifier for the class and that its value will be generated automatically (with GenerationType.AUTO, Hibernate picks a strategy suited to the underlying database).

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity                     // tells hibernate this is something it will have to manage
@Table(name="participants") // tells hibernate which table to use
public class Participant {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Integer id;
    private String name;
    private String email;

    public String getEmail() {
        return email;
    }
    /** remaining getters and setters **/
}
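To get a feel for how a mapper can derive a table from nothing but class metadata, here is a minimal sketch in plain Java. It is not how Hibernate is implemented: the SketchTable and SketchId annotations, the type mapping and the generated SQL are all assumptions made up for this illustration. It only shows that annotations plus reflection carry enough information to build a CREATE TABLE statement.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class SchemaSketch {

    // Hypothetical stand-ins for @Table and @Id; these are not the JPA annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface SketchTable { String name(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface SketchId { }

    @SketchTable(name = "participants")
    static class Participant {
        @SketchId Integer id;
        String name;
        String email;
    }

    // Builds a CREATE TABLE statement from the class metadata alone.
    static String createTableFor(Class<?> type) {
        SketchTable table = type.getAnnotation(SketchTable.class);
        List<String> columns = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) continue; // skip compiler-generated fields
            // Assumed mapping: Integer becomes INT, everything else VARCHAR(255).
            String sqlType = field.getType() == Integer.class ? "INT" : "VARCHAR(255)";
            String suffix = field.isAnnotationPresent(SketchId.class) ? " NOT NULL PRIMARY KEY" : "";
            columns.add(field.getName() + " " + sqlType + suffix);
        }
        return "CREATE TABLE " + table.name() + " (" + String.join(", ", columns) + ")";
    }

    public static void main(String[] args) {
        System.out.println(createTableFor(Participant.class));
    }
}
```

Adding a field to the sketch's Participant class changes the generated statement without touching createTableFor, which is the property the rest of this post relies on.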


This is the code for the Unit class.


@Entity
@Table(name="units")
public class Unit {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Integer id;
    private String name;
    private Integer type;

    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public Integer getType() {
        return type;
    }
    public void setType(Integer type) {
        this.type = type;
    }
}


Sample .NET code


The same approach works in .NET, where attributes play the role of annotations. The following code shows their usage.

Source:
http://nhibernate.sourceforge.net/NHibernateEg/NHibernateEg.Tutorial1A.html#NHibernateEg.Tutorial1A-Configuration


[NHibernate.Mapping.Attributes.Class(Table="SimpleOrder")]
public class Order
{
    private int _id = 0;
    private System.DateTime _date = System.DateTime.Now;
    private string _product;

    // Primary key, generated by the database's native mechanism
    [NHibernate.Mapping.Attributes.Id(Name="Id")]
    [NHibernate.Mapping.Attributes.Generator(1, Class="native")]
    public virtual int Id
    {
        get { return _id; }
    }

    [NHibernate.Mapping.Attributes.Property]
    public virtual System.DateTime Date
    {
        get { return _date; }
    }

    [NHibernate.Mapping.Attributes.Property(NotNull=true)]
    public virtual string Product
    {
        get { return _product; }
        set { _product = value; }
    }
}



Hibernate has a tool called hbm2ddl that can create the underlying database structure for you. Using this tool with the above annotated classes creates the following database structure:

participants

Field   Type           Null   Default
id      int(11)        No
email   varchar(255)   Yes    NULL
name    varchar(255)   Yes    NULL

units

Field   Type           Null   Default
id      int(11)        No
name    varchar(255)   Yes    NULL
type    int(11)        Yes    NULL

Hibernate becomes aware of the classes it needs to
manage as well as other configuration parameters through the so
called hibernate configuration file. The following XML file shows the
configuration used in this example. We simply list the classes and
Hibernate handles the rest.


<!DOCTYPE hibernate-configuration PUBLIC "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">

<hibernate-configuration>
    <session-factory>
        <mapping class="net.blog.hibernatereasons.model.Participant"/>
        <mapping class="net.blog.hibernatereasons.model.Unit"/>
    </session-factory>
</hibernate-configuration>


The last bit of configuration we need is the SessionFactory. The
session factory handles the sessions to the underlying database and
requires a database connection, which is provided by a data source.
The XML below shows a common configuration using Spring. Spring
isn't required with Hibernate, but it helps a lot. I'm using it in
this example because it shows in simple XML what would otherwise
take lots of code to demonstrate. Basically we end up with two objects,
one of type DriverManagerDataSource and the other of type
AnnotationSessionFactoryBean (only the AnnotationSessionFactoryBean
is used in our DAO implementation).


By looking at the XML it is easy to identify the
key configuration values used by AnnotationSessionFactoryBean:



  • One is the datasource. In this case we are
    using a simple JDBC datasource, but we could use a container
    provided resource (JNDI for example).


  • The second is the hibernate.cfg.xml file we
    saw above. It contains the classes to manage as well as other
    configuration settings for Hibernate. In this case we tell it to
    find the config file in the classpath.


  • Finally we need to tell Hibernate what
    dialect to use, in this case org.hibernate.dialect.MySQLDialect.
    This is a really strong point for ORM. Being able to change dialect
    really empowers you as a programmer. You no longer need to become
    proficient in a particular database. Not that such proficiency would
    hurt you, but consider the amount of time you'll spend debugging SQL
    commands in a database you're not that good at versus letting
    Hibernate deal with that.



<!-- Datasource configuration -->
<bean id="dataSource"
      class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName">
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property name="url">
        <value>jdbc:mysql://localhost:3306/blogSample</value>
    </property>
    <property name="username">
        <value>databaseuser</value>
    </property>
    <property name="password">
        <value>secret</value>
    </property>
</bean>
<!-- Hibernate SessionFactory -->
<bean id="sessionFactory"
      class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="configLocation" value="classpath:hibernate.cfg.xml" />
    <property name="hibernateProperties">
        <value>
            hibernate.dialect=org.hibernate.dialect.MySQLDialect
        </value>
    </property>
</bean>
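The dialect setting is worth a second look, since it is what keeps database-specific SQL out of your own code. The following is only a rough sketch of the idea in plain Java. The SqlDialect interface and its two implementations are invented for this post and are far simpler than Hibernate's real dialect classes; the only facts borrowed from the real world are that MySQL limits rows with LIMIT and SQL Server with TOP.

```java
public class DialectSketch {

    // Hypothetical interface; Hibernate's real Dialect class is far richer.
    interface SqlDialect {
        String firstRows(String table, int count);
    }

    // MySQL limits a result set with a trailing LIMIT clause.
    static class MySqlStyle implements SqlDialect {
        public String firstRows(String table, int count) {
            return "SELECT * FROM " + table + " LIMIT " + count;
        }
    }

    // SQL Server limits a result set with TOP right after SELECT.
    static class SqlServerStyle implements SqlDialect {
        public String firstRows(String table, int count) {
            return "SELECT TOP " + count + " * FROM " + table;
        }
    }

    public static void main(String[] args) {
        SqlDialect[] dialects = { new MySqlStyle(), new SqlServerStyle() };
        for (SqlDialect dialect : dialects) {
            // The calling code is identical; only the chosen dialect differs.
            System.out.println(dialect.firstRows("participants", 10));
        }
    }
}
```

The calling code asks for "the first ten participants" and never sees which syntax was used, much like swapping the hibernate.dialect value in the configuration above.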

An equivalent .NET configuration can be found here:
http://www.springframework.net/doc-latest/reference/html/orm.html


Two


Easy to extend code


Adding an extra field


One key benefit of using ORM is the speed at which you can add new
features to your code. Adding an extra field to your class is a simple
matter of adding the field code (declaration and getter/setter pair)
and, if you use XML mapping files rather than annotations, adding the
mapping to the underlying table. I'll go over three quick examples.


Adding a simple property to the class


Let's suppose we now need to
add a nickname to the Participant class. Adding this is as simple as
declaring it.






    private String nickname;


That's it, you're done. Hibernate will scan the class and pick up the
new field. If you use automatic database generation the corresponding
column will be added automatically. You don't buy it? Let me repost
the whole class just to make it clear:


@Entity                     // tells hibernate this is something it will have to manage
@Table(name="participants") // tells hibernate which table to use
public class Participant {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Integer id;
    private String name;
    private String email;
    private String nickname;

    public String getEmail() {
        return email;
    }
    /** remaining getters and setters **/
}


Like I said. You're done. In a moment we'll go over the DAO magic
that makes this possible.


Adding a more complex class as a property





Now imagine we want to add more information to the class. We realize we
need a telephone and probably some other contact info. We could keep
adding these as properties of the class, but that could become a mess.
It's much better to encapsulate all this in a new class called
ContactInfo. Handling this in Hibernate is pretty easy.





We create the ContactInfo class.





@Embeddable
public class ContactInfo {
    private String email;
    private String telephone;
    private String chat;

    public String getEmail() {
        return email;
    }
    public void setEmail(String email) {
        this.email = email;
    }
    public String getTelephone() {
        return telephone;
    }
    public void setTelephone(String telephone) {
        this.telephone = telephone;
    }
    public String getChat() {
        return chat;
    }
    public void setChat(String chat) {
        this.chat = chat;
    }
}


Notice how this class uses the @Embeddable annotation. This tells
Hibernate that this class isn't actually mapped to any particular
table. It will become part of a more complex class which will be then
mapped to a table.





We remove the email property from the Participant class and add the
ContactInfo property as shown below.





@Entity
@Table(name="participants")
public class Participant {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Integer id;
    private String name;
    private String nickname;
    private ContactInfo contact; // stored as columns of the participants table

    /** getters and setters **/
}


And once again we are done.


Adding a list of objects


Now we need to add the units that each participant
can control. To make this fun let's imagine they are fleet units like
fighters and bombers and cruisers and what not. We simply write the
class out and add the corresponding annotations, just like we did
with Participant.


@Entity
@Table(name="units")
public class Unit {

    public static final int TYPE_CRUISER=1;
    public static final int TYPE_TRANSPORT=2;
    public static final int TYPE_DESTROYER=3;
    public static final int TYPE_FIGHTER=4;
    public static final int TYPE_BOMBER=5;

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Integer id;
    private String name;
    private Integer type;

    public Integer getId() {
        return id;
    }
    /** getters and setters **/
    public void setType(Integer type) {
        this.type = type;
    }
}


We'll start with a very basic way to distinguish one Unit from the
other. A field type will be used. In the next chapter we'll go over
polymorphism and let Hibernate handle this for us automatically.


To create the relationship between Participant and
Unit we add the following in the Participant class:


    @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.EAGER)
    @JoinColumn(name="participantId")
    @IndexColumn(name="unitOrder") // org.hibernate.annotations.IndexColumn; stores each unit's position in the list
    private List<Unit> units=new ArrayList<Unit>();


This tells Hibernate that Participant has a
list of Unit. In the background Hibernate will add a column to the
units table called participantId that indicates which Participant
said Unit belongs to. The list's order will be kept in a column
called unitOrder. Retrieving the Participant will retrieve the units
list automatically and in the order they were set when it was saved.
The OneToMany annotation has two properties set. One is the cascade,
indicating that unpersisted units must be saved when the participant
is saved, and modified units updated. The other is the fetch, which is
set to eager as opposed to lazy, meaning that the units are loaded
together with the participant instead of being deferred until the
list is first accessed.
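The difference between eager and lazy fetching is easy to lose in the annotation soup, so here is a small sketch in plain Java. It does not use Hibernate at all: the EagerParticipant and LazyParticipant classes and the loadUnits method are hypothetical stand-ins, and the counter simply simulates round trips to the database.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class FetchSketch {

    static int loads = 0; // counts simulated round trips for the units

    // Stands in for the query that reads a participant's units.
    static List<String> loadUnits() {
        loads++;
        return Arrays.asList("fighter", "bomber", "cruiser");
    }

    // Eager: units are read while the participant itself is being built.
    static class EagerParticipant {
        final List<String> units = loadUnits();
    }

    // Lazy: units are read on first access and then remembered.
    static class LazyParticipant {
        private final Supplier<List<String>> loader;
        private List<String> units;

        LazyParticipant(Supplier<List<String>> loader) {
            this.loader = loader;
        }

        List<String> getUnits() {
            if (units == null) {
                units = loader.get();
            }
            return units;
        }
    }

    public static void main(String[] args) {
        new EagerParticipant();
        System.out.println("loads after eager construction: " + loads);
        LazyParticipant lazy = new LazyParticipant(FetchSketch::loadUnits);
        System.out.println("loads after lazy construction: " + loads);
        lazy.getUnits();
        lazy.getUnits();
        System.out.println("loads after two lazy accesses: " + loads);
    }
}
```

Eager fetching pays the cost up front whether or not the list is ever used. Lazy fetching pays it only on first access, which matters once participants own many units.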


The actual DAOs


Up until now we've been
declaring some classes and adding some extra text in the form of
annotations. But how will this benefit us when we actually have to
retrieve our participants from the database? Let's go over the
implementation of our DAOs to get a quick glimpse of this. A deeper
look into queries will be covered in the second chapter.


First we create our
ParticipantDao interface for basic CRUD operations. This is always
good practice in case you decide to use something different from ORM
in the future.


public interface ParticipantDao {

void save(Participant participant);
void update(Participant participant);
void delete(Participant participant);
Participant findById(Integer id);
}


Then we create our implementation of this
interface. By extending HibernateDaoSupport (a Spring class) we can
encapsulate a great deal of the session and transaction management in
one place. Our implementation is reduced to the following code:


import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

public class ParticipantDaoImpl extends HibernateDaoSupport implements
        ParticipantDao {

    public void delete(Participant participant) {
        getHibernateTemplate().delete(participant);
    }

    public Participant findById(Integer id) {
        return (Participant) getHibernateTemplate().get(Participant.class, id);
    }

    public void save(Participant participant) {
        // save() inserts a new row; update() is only for already persisted objects
        getHibernateTemplate().save(participant);
    }

    public void update(Participant participant) {
        getHibernateTemplate().update(participant);
    }
}


Notice how none of the code here has anything
to do with the actual Participant class declaration. All the changes
we did like adding fields, moving the contact information into its
own class and adding the list of units required no modification in
the DAO.
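The interface pays off in another way too: callers never need to know what sits behind it. As a hedged illustration, here is a hypothetical in-memory implementation, the kind of thing you might use in a unit test. The Participant class below is a trimmed-down stand-in for the one in this post, and InMemoryParticipantDao is my own invention, not part of Hibernate or Spring.

```java
import java.util.HashMap;
import java.util.Map;

public class InMemoryDaoSketch {

    // Trimmed-down stand-in for the Participant class in the post.
    static class Participant {
        Integer id;
        String name;

        Participant(String name) {
            this.name = name;
        }
    }

    interface ParticipantDao {
        void save(Participant participant);
        void update(Participant participant);
        void delete(Participant participant);
        Participant findById(Integer id);
    }

    // Hypothetical implementation backed by a map instead of Hibernate.
    static class InMemoryParticipantDao implements ParticipantDao {
        private final Map<Integer, Participant> rows = new HashMap<>();
        private int nextId = 1;

        public void save(Participant participant) {
            participant.id = nextId++; // mimics a generated primary key
            rows.put(participant.id, participant);
        }

        public void update(Participant participant) {
            rows.put(participant.id, participant);
        }

        public void delete(Participant participant) {
            rows.remove(participant.id);
        }

        public Participant findById(Integer id) {
            return rows.get(id);
        }
    }

    public static void main(String[] args) {
        ParticipantDao dao = new InMemoryParticipantDao();
        Participant participant = new Participant("Ada");
        dao.save(participant);
        System.out.println(dao.findById(participant.id).name);
        dao.delete(participant);
        System.out.println(dao.findById(participant.id));
    }
}
```

Code written against ParticipantDao works the same with this class or with the Hibernate-backed one, which is what makes swapping implementations cheap.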


Take a moment to consider
how the work would have grown across the three examples without ORM.
Adding an extra property is little work: just modify the SQL to
include the field. Adding a class to contain contact information and
moving some of the Participant properties into it would have required
a lot more work than just adding an extra property. Remember we need
to modify the insert, update and retrieve queries. It's getting more
complex by the minute.


When we get to adding the
units we can see how complex our DAO would have been if we didn't use
ORM. We'd have to get the participant (including the ContactInfo
class in it), then get the units in order and place them inside the
participant object. And we haven't even gotten to saving or updating.
Clearly the time needed for conventional coding and testing of this
code grows quickly as the classes become more complex. Imagine
the benefits in even more complex data structures!
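To make that concrete, here is a rough sketch of the SQL a hand-written DAO would have to carry around for our small model. The column names for the embedded contact info (email, telephone, chat) and the exact statements are my assumptions, not something Hibernate generated. The point is only that every column is spelled out by hand, so each new field means editing several statements.

```java
import java.util.ArrayList;
import java.util.List;

public class HandWrittenSqlSketch {

    // Every column is listed by hand; a new field means editing each statement.
    static final String INSERT_PARTICIPANT =
        "INSERT INTO participants (name, nickname, email, telephone, chat) VALUES (?, ?, ?, ?, ?)";
    static final String INSERT_UNIT =
        "INSERT INTO units (name, type, participantId, unitOrder) VALUES (?, ?, ?, ?)";
    static final String SELECT_PARTICIPANT =
        "SELECT id, name, nickname, email, telephone, chat FROM participants WHERE id = ?";
    static final String SELECT_UNITS =
        "SELECT id, name, type FROM units WHERE participantId = ? ORDER BY unitOrder";

    // Lists the statements a hand-written save would issue for one participant.
    static List<String> statementsForSave(int unitCount) {
        List<String> statements = new ArrayList<>();
        statements.add(INSERT_PARTICIPANT);
        for (int i = 0; i < unitCount; i++) {
            statements.add(INSERT_UNIT);
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String sql : statementsForSave(3)) {
            System.out.println(sql);
        }
        System.out.println(SELECT_PARTICIPANT);
        System.out.println(SELECT_UNITS);
    }
}
```

And this still leaves out the update and delete statements, the parameter binding and the code that copies result rows back into objects.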


Conclusion


Overall ORM is a great tool
that greatly simplifies your coding, not only in terms of what you
have to actually write out, but also in terms of time spent testing.
It does carry a lot more application overhead than writing your simple
ODBC/SQL code, but it is also a lot richer in features. It will add
bloat to your application, but counter that with the benefits of a
well-developed and proven cache. On one side you have more megs in
size at distribution time. On the other you have a cache that speeds
up your application. Think runtime memory footprint versus database
latency.


Think down the road. How
much time are you saving when you need to make changes to your
application? How much time are you saving when you actually need to
test those changes? Experience in ORM is something you can leverage
across multiple databases. Not once did we see any database-specific
code. Through the use of the SQL dialect, one's experience with
ORM can be easily ported to multiple databases.


In my experience the time
spent learning and the initial overhead issues are quickly outweighed
by the benefits down the road. In the next chapter we'll go over
queries. We'll see how they are handled and what benefits we can reap
by using ORM instead of conventional methods.





Monday, August 31, 2009

PHP empty loop test

Empty For Loop Test Analysis


Objective

Determine the amount of time consumed by each part of a for statement given an empty loop, on the understanding that a statement such as:


for($i=0;$i<$maxCount;$i++) {

}


will do nothing between the braces. Our goal is to determine the amount of time consumed by each element between the parentheses.


Methodology

The time consumed in a loop for a given value of maxCount can easily be measured making maxCount significantly large and using the system clock to find a time difference between the time stamp prior to the execution and after the execution of the for loop. Since we can't use the system clock inside the for loop without affecting the for loop's performance we will have to use a system of equations to obtain the values.


Given that total time can be approximated by :


Ax+By+Cz=Ttotal


Where:

x : amount of time that it takes to set $i=0

y : amount of time to do the comparison $i<$maxCount

z : amount of time to do the increment $i++

A : the amount of times $i=0 is executed

B : the amount of times the comparison is made

C : the amount of times the increment is performed


We can set a system of equations like :


A1x+B1y+C1z=T1

A2x+B2y+C2z=T2

A3x+B3y+C3z=T3


By performing a set of well-crafted tests that change the values of An, Bn and Cn, the system can be solved and the values of x, y and z that satisfy the equations can be obtained.


Considering that $i=0 is executed only once and that $maxCount is very large ($maxCount>>1), we can disregard the impact of Anx, simplifying the equations to:


B1y+C1z=T1

B2y+C2z=T2


If we set the value of $maxCount=20 million we can run our first test and complete the first equation:


2E7y+2E7z=T1


To obtain our second equation we must modify the loop so there is a difference in the amount of times an operation is performed in comparison to the other. In this case we do the following:


for($i=0;$i<$maxCount;$i++) {

$i++;

$i++;

$i++;

$i++;

$i++;

$i++;

$i++;

$i++;

$i++;

}


In this example there are ten increments for every comparison. We now set the value of $maxCount to 200 million. Since there are ten increments per iteration we still perform only 20 million comparisons, and equation 2 looks like:


2E7y+2E8z=T2


With these two equations it is only a matter of running the tests to obtain the times.



Test

Running each for loop 5 times and averaging the times, we obtain:


            Comparisons   Increments   Time
Equation 1  20000000      20000000     2.49
Equation 2  20000000      200000000    11.36



Solving the equations we obtain the following values for z and y respectively.


Time per increment : 0.0000000492738678717778

Time per comparison: 0.0000000750540206172222
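For anyone who wants to check the arithmetic, the two equations can be solved with Cramer's rule. The sketch below is written in Java purely as a calculator and is not part of the PHP test itself; the class and method names are made up. Note that it uses the rounded times from the table, so its results land within about half a percent of the values quoted above rather than matching them digit for digit.

```java
public class LoopTimingSolver {

    // Solves  b1*y + c1*z = t1  and  b2*y + c2*z = t2  by Cramer's rule.
    // Returns {y, z}: time per comparison and time per increment.
    static double[] solve(double b1, double c1, double t1,
                          double b2, double c2, double t2) {
        double det = b1 * c2 - b2 * c1;
        double y = (t1 * c2 - t2 * c1) / det;
        double z = (b1 * t2 - b2 * t1) / det;
        return new double[] { y, z };
    }

    public static void main(String[] args) {
        // Equation 1: 2E7 comparisons, 2E7 increments, 2.49 total
        // Equation 2: 2E7 comparisons, 2E8 increments, 11.36 total
        double[] yz = solve(2e7, 2e7, 2.49, 2e7, 2e8, 11.36);
        System.out.println("time per comparison (y): " + yz[0]);
        System.out.println("time per increment  (z): " + yz[1]);
    }
}
```

Subtracting equation 1 from equation 2 by hand gives the same thing: 1.8E8 z = 8.87, so z is roughly 4.93E-8, and substituting back gives y of roughly 7.52E-8.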


We can observe that it takes less time to increment than to compare. Based on these results we rewrite the for loop in the following way:


for($i=0;$maxCount-$i;$i++) {

}


This substitutes the comparison with a subtraction, on the understanding that a subtraction resulting in 0 will evaluate to false and break out of the loop.


We then run the test again with this change in effect. The time to run a 20 million iteration loop is reduced from 2.49 to 1.92. That's approximately 77% of the original time, a reduction of about 23%.



Monday, March 31, 2008

Why OOXML stinks

I believe the OOXML standard has nothing to do with "Interoperability". It has more to do with not losing clients who ask for a document standard to save their info with. Microsoft will then go to the heads of government and companies and brainwash them into believing they are actually using a standards-based format. Technically they are (an ISO-approved one), but in truth Microsoft isn't holding to a standard even with itself.

The same elements in a Word document are not represented by the same tags in a Power Point document. Thus by using OOXML not only do you have to develop a mapping between legacy formats (old Office stuff), but you also have to develop a mapping strategy between content in the same format (XML). Take for example the following text "The cow jumped over the moon". I made a text document and a presentation out of that string. One copy in Open Office using ODF and one with Office 2007 using OOXML. (I have removed the style tags for simplicity)

The text in the Writer (Open Office Word equivalent) shows up like this:

<text:p text:style-name="Standard">
The
<text:span text:style-name="T1">cow</text:span>
<text:span text:style-name="T3">jumped over the</text:span>
<text:span text:style-name="T2" />
moon
</text:p>

The same text in a bullet slide in Impress (Open Office Power Point equivalent) shows up like this:

<text:p text:style-name="P1">
<text:span text:style-name="T1">The</text:span>
<text:span text:style-name="T2">cow</text:span>
<text:span text:style-name="T3">jumped over the</text:span>
<text:span text:style-name="T4" />
<text:span text:style-name="T5">moon</text:span>
</text:p>

Pretty similar if you ask me. I even embedded the text as a simple text overlay in the slide (not as bulleted text). The entry looks something like:

<draw:object xlink:href="./Object 1" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" />

And going to look into the ./Object 1 data we have:

<text:p text:style-name="Standard">
The
<text:span text:style-name="T1">cow</text:span>
<text:span text:style-name="T3">jumped over the</text:span>
<text:span text:style-name="T2" />
moon
</text:p>

Gosh!!! Just like in Writer! Must have been because it's embedded.

Now let's try it out with Office 2007. The text in Word:

<w:p w:rsidR="008A5B24" w:rsidRDefault="008A5B24" w:rsidP="008A5B24">
<w:pPr>
<w:pStyle w:val="NormalWeb" />
<w:spacing w:after="0" />
</w:pPr>
<w:r>
<w:t xml:space="preserve">The</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b />
<w:bCs />
</w:rPr>
<w:t xml:space="preserve">cow</w:t>
</w:r>
<w:r>
<w:rPr>
<w:b />
<w:bCs />
<w:color w:val="FF0000" />
</w:rPr>
<w:t>jumped over the</w:t>
</w:r>
<w:r>
<w:rPr>
<w:color w:val="FF0000" />
</w:rPr>
<w:t xml:space="preserve"></w:t>
</w:r>
<w:r>
<w:t>moon</w:t>
</w:r>
</w:p>

Argghhhhh!!!! Now the same in Power Point:

<a:p>
- <a:r>
<a:rPr lang="es-ES" dirty="0" err="1" smtClean="0" />
<a:t>The</a:t>
</a:r>
- <a:r>
<a:rPr lang="es-ES" dirty="0" smtClean="0" />
<a:t />
</a:r>
- <a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0" />
<a:t>cow</a:t>
</a:r>
- <a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0" />
<a:t />
</a:r>
- <a:r>
- <a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>jumped</a:t>
</a:r>
- <a:r>
- <a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
- <a:r>
- <a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>over</a:t>
</a:r>
- <a:r>
- <a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
- <a:r>
- <a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>the</a:t>
</a:r>
- <a:r>
- <a:rPr lang="es-ES" dirty="0" smtClean="0">
- <a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
- <a:r>
<a:rPr lang="es-ES" dirty="0" err="1" smtClean="0" />
<a:t>moon</a:t>
</a:r>
<a:endParaRPr lang="es-ES" dirty="0" smtClean="0" />
</a:p>
- <a:p>

The embedded text in the Power Point slide is in a:

<p:sp>
<p:nvSpPr>
<p:cNvPr id="6" name="5 Rectángulo" />
<p:cNvSpPr />
<p:nvPr />
</p:nvSpPr>
<p:spPr>
<a:xfrm>
<a:off x="2963835" y="3244334" />
<a:ext cx="3216330" cy="369332" />
</a:xfrm>

section and is described as follows:

<p:txBody>
<a:bodyPr wrap="none">
<a:spAutoFit />
</a:bodyPr>
<a:lstStyle />
<a:p>
<a:r>
<a:rPr lang="es-ES" dirty="0" err="1" smtClean="0" />
<a:t>The</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" dirty="0" smtClean="0" />
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0" />
<a:t>cow</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0" />
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>jumped</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>over</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>the</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" dirty="0" err="1" smtClean="0" />
<a:t>moon</a:t>
</a:r>
<a:endParaRPr lang="es-ES" dirty="0" />
</a:p>
</p:txBody>


This clarifies the issue with OOXML beyond all possible zealotry, camping, shilling or in any way unsupported IT fanaticism. OOXML is sloppy!! Period. While ODF uses the same tag "<text:span>" to enclose text in all applications, OOXML uses two different tags. OOXML also takes more markup to represent the same content. Notice how each word in the Power Point OOXML has its own formatting. Not only is the formatting repeated because OOXML in Office 2007 fails to use styles, it is repeated for every word, even if the word is formatted exactly like the one next to it.
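To put a rough number on "takes more", here is a small sketch that counts how many characters of markup each format spends per character of visible text for the string "jumped over the". The two snippets are the ODF span and the Word run shown earlier, with whitespace between tags removed; the overhead measure itself is just something I made up for this comparison. Keep in mind it flatters ODF a little, since the T3 style is defined elsewhere in the file and isn't counted here.

```java
public class MarkupOverheadSketch {

    // Characters of markup spent per character of visible text.
    static double overhead(String markup, String text) {
        return (double) (markup.length() - text.length()) / text.length();
    }

    public static void main(String[] args) {
        String text = "jumped over the";
        String odf = "<text:span text:style-name=\"T3\">jumped over the</text:span>";
        String word = "<w:r><w:rPr><w:b /><w:bCs /><w:color w:val=\"FF0000\" /></w:rPr>"
                    + "<w:t>jumped over the</w:t></w:r>";
        System.out.println("ODF overhead:  " + overhead(odf, text));
        System.out.println("Word overhead: " + overhead(word, text));
    }
}
```

The same measure can be applied to the Power Point runs by pasting them in as a third string.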

The string "jumped over the" has the same style and is properly represented in ODF:

Note: the words have been marked in red to show relative space usage.

<text:span text:style-name="T3">jumped over the</text:span>

Word gets it right. A rather lengthy format, but all the text is enclosed in it:

<w:r>
<w:rPr>
<w:b />
<w:bCs />
<w:color w:val="FF0000" />
</w:rPr>
<w:t>jumped over the</w:t>
</w:r>

It takes a few more characters, but nothing like Power Point, which uses format definition tags for each one of the words, even when all three words have the exact same format! Think about what that does to your file size. Even though OOXML files are zipped, that is extra overhead that is totally uncalled for. After all, OOXML is not zipped when in your PC's RAM, thus consuming one of the most expensive components of your computer. Take a look at this:

<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>jumped</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>over</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>
<a:rPr lang="es-ES" b="1" dirty="0" err="1" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t>the</a:t>
</a:r>
<a:r>
<a:rPr lang="es-ES" dirty="0" smtClean="0">
<a:solidFill>
<a:srgbClr val="FF0000" />
</a:solidFill>
</a:rPr>
<a:t />
</a:r>
<a:r>

This is just ridiculous. So if you have a hard time convincing someone that OOXML is bad, just show them this. If they're management, ask them if they're willing to pay for the added implementation cost of handling formats which are not compatible within the same application suite (Office 2007). While ODF only requires you to code once, with Office you have to code two different ways of opening files and then some more to "copy paste content" between XML storage formats.