Problems with OOXML import/export of SmartArt

Hi everyone,

In released versions, it is not the diagram itself that is imported but the replacement drawing. In a pptx presentation this is ppt/diagrams/drawing1.xml, for example.
If you set the environment variable DIAGRAM_IGNORE_EXTDRAWINGS=1, this fallback is ignored and the true diagram definition is used. You will notice that the result is so poor that it is unacceptable to users.
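
To check which diagram parts a given file actually contains, a minimal Python sketch like the following may help. The function name `list_diagram_parts` is made up for illustration, and the regular expression assumes the part names PowerPoint usually writes (data1.xml, layout1.xml, drawing1.xml and so on); other producers may name the parts differently.

```python
import io
import re
import zipfile

# Matches e.g. ppt/diagrams/drawing1.xml. The naming scheme is an assumption
# based on what PowerPoint usually writes, not something the format guarantees.
DIAGRAM_PART = re.compile(r"^(?:ppt|word|xl)/diagrams/([A-Za-z]+)\d*\.xml$")


def list_diagram_parts(package):
    """Group the diagram parts of an OOXML package by kind (hypothetical helper).

    `package` is a file name or a binary file-like object.
    """
    parts = {}
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            match = DIAGRAM_PART.match(name)
            if match:
                parts.setdefault(match.group(1), []).append(name)
    return parts


# Small in-memory demo package, so the sketch runs without a real file.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("ppt/diagrams/data1.xml", "<x/>")
    zf.writestr("ppt/diagrams/drawing1.xml", "<x/>")
    zf.writestr("ppt/slides/slide1.xml", "<x/>")
demo.seek(0)

demo_parts = list_diagram_parts(demo)
has_fallback = "drawing" in demo_parts  # True if a replacement drawing exists
```

A file that has no "drawing" entry has no fallback, so for such a file the import has to rely on the diagram definition in any case.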

After bringing the information from file to our internal classes, the layout of the diagram is calculated in oox/source/drawingml/diagram/diagramlayoutatoms.cxx.

I have examined the current solution and see a lot of to-dos: rules are not read, constraints are only partly read, orientations are wrong, and color transitions are missing, for example. However, I also see a fundamental problem:

MS Office determines the size of the shapes of the diagram from the font size and the text contained in the shapes. In general, if the text content of a shape becomes larger, the shape becomes wider. The layout algorithms have constraints and rules that adapt the shapes to the purpose of the layout algorithm and change them so that the diagram fits into the given diagram area. This includes reducing the font size, for example.
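
To illustrate the dependency, here is a toy Python sketch. It is explicitly not the algorithm MS Office uses, which is not documented; the function name, the crude width estimate (character count times font size times a factor) and all default values are assumptions made only for this example. It shows why the layout needs text measurement: the shape widths follow from the text, and the font size is reduced until everything fits into the diagram area.

```python
def fit_font_size(texts, area_width, start_pt=40.0, min_pt=5.0,
                  step=1.0, char_factor=0.6, padding=10.0):
    """Shrink a shared font size until side-by-side shapes fit (toy model).

    Each shape is assumed to be as wide as its text plus a fixed padding.
    A real implementation would measure the text with the actual font.
    """
    size = start_pt
    while True:
        # Longer text means a wider shape at the same font size.
        widths = [len(text) * size * char_factor + padding for text in texts]
        if sum(widths) <= area_width or size <= min_pt:
            return size, widths
        size = max(min_pt, size - step)


size, widths = fit_font_size(["Plan", "Build", "Ship it"], area_width=300.0)
```

In this toy model the widths can only be computed once the text can be measured, which is the point of the question that follows.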

But at the time the layout is performed in diagramlayoutatoms.cxx, the shapes have not yet been inserted into the page. I think that, because of this, a shape cannot adapt its size to its text content. Is that correct?

Do you agree with my diagnosis? If so, how can we solve the problem?

Does anyone have the time and interest to discuss these problems with me?

Kind regards,

Regina

Hi Regina, thanks for asking - obviously it’s by far the best if the engineer working on this, Armin, responds here :) I’ll encourage him to get back to you. All the best!

Hi Regina,

Yes, there is a lot to do. That is what I am communicating all the time, e.g. see https://forum.collaboraonline.com/t/ttt-today-by-armin-le-grand-smartart-demo-current-state-of-affairs/4545. We are working on improving SmartArt in general - we are not at the beginning, but far from being done, too.

As I explained in that TTT and also in my last COOL-Days talk there are three big tasks to get to a better, working SmartArt implementation:

(a) the Model part
(b) the UI part
(c) the ReLayout/Algorithm part

Parts (a) and (c) in particular are independent, so what I have concentrated on up to now is (a): it is needed in any case. It's about ooxml/odf exchange/roundtrips/internal model representation, including embedding to Group, SubSelection, attribute changes, automatic handling of orig DomTrees, re-creation of data/drawing.xml, triggering of add/remove/relayout/ungroup, needed slots, etc, etc.
This is making good progress; it is not yet complete, but getting close. We still need to discuss the ODF file format, and acceptance of our ooxml when re-loading it in MSO needs to be improved.

There is also some stuff for (b), but not too much yet: visualisation as SmartArt by using a frame as feedback, possibilities for triggering that small old dialog (which will not be there in the end), SubSelection & traveling, AttributeChanges, TextChanges, DirectTextChange, etc. Of course much nicer/better stuff will be possible, including a gallery, etc, etc.

I have not yet touched (c), which will also be a lot of work, especially because, as is typical, the inventor has not documented how to execute it. If that had been documented, it would never have been necessary to add drawing.xml to ooxml after ca. two years. I cannot estimate much about it, but it will mainly be about systematically finding out how to execute that algorithm (reverse-engineering, unfortunately). I always warned that that part will be hard, but the person originally involved in doing it assured me that it is doable (and he will eventually do that or try - we will see). So I am confident that we will make progress there.

Of course any help, especially with (c), is highly appreciated - if you can dig into it and figure out basic functionalities, that would be fantastic! I plan to get (a) complete and then drive (c) forward as well as I can. Until now there are no changes to (c); it is exactly in the form it was in when I first looked at it.

So for that text stuff and sizes: I know there is some calculation already in place now (look for it in the oox part). The basic problem is that we will not get away from oox::object usage, which originates from ooxml import but is also used in the re-layout: the oox::object is like a bit-bucket, all data from ooxml import is thrown at it, and then in a huge method an XShape is created from that. Since SmartArt layout is close to ooxml stuff, that was used for re-layout, too: all data from re-layout is collected there and the XShape is created. The valuable part is that creation; it has grown over years and is hard to isolate, so it makes no sense to try to create XShapes directly. I currently do not yet know how the text-size stuff will and needs to be handled, so I cannot really answer that, but there will be XShapes only at the very end of the re-layout - for now.

All in all: this is work in progress, in no way complete. As I said in my COOL-Days talk (pls look at the infos on the briefly shown slides), I would guess SmartArt is done somewhere between 1/3 and 2/3, maybe 50%.

HTH!

Hi @alalg ,

thank you for taking the time to look at my post.

Unfortunately, the video is not yet available. Perhaps you could send me your slides, and perhaps those from your COOL-Days talk as well?

I have already learned a lot about SmartArt in OOXML since your TTT talk. So I can likely help.

I’ve created a set of all predefined SmartArt types, have created examples from .glox files, and have changed existing examples so that I can test in PowerPoint the result of changes in the xml source. I know in principle how constraints and rules work. And I already know that not all needed information is imported, e.g. attribute meth is missing, constraints without forName attribute are missing, and rules are not imported. I also already know of errors in the current layout and style calculations. Is there a suitable place to collect the problems, other than writing a lot of bug reports although it is work in progress? Perhaps in shared documents? Is there something similar to https://nextcloud.documentfoundation.org for Collabora? Or use a page under User:Regina in the TDF Wiki? Or shared documents on Google Docs? Any suggestion?

I’ll do my best to help you solve (c). I suspect that in order to get a correct SmartArt, we need to re-layout it after it has been inserted into the page, when importing the OOXML file.

Hi Regina,

> I’ve created a set of all predefined SmartArt types, have created
> examples created from .glox files and have changed existing examples so,
> that I can test in PowerPoint the result of changes in the xml source. I
> know in principle how constraints and rules work.

Really glad Armin is back from vacation here :) and it sounds like you’re making good progress.

One thing I was wondering: something we can perhaps verify. We have a
very large corpus of test OOXML documents; I wonder if it would be worth
writing a python script, and running it over all of these to dig out and
enumerate the constraint XML behind the smart-arts across all of them.

My money would be on 99+% of these constraints being un-modified copies
of the original built-in pre-defined SmartArts in MS Office - but it
would be good to know that for ~sure.

If that is the case, we could dramatically subset our testing and
development to focus first on the most duplicated ones of those (for
different input bullets), and then on ensuring ‘correctness’ for the
predefined types’ constraints, and then of course for our own gallery
variants as/when.

Might be useful to have that data; as/when you up-load your test files
if you’ve not got to it, I’d be interested in helping writing a script
to dig out that data - to try to dramatically subset and simplify the
search space :)
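
A rough first sketch of such a script could look like the following. It is only an outline under several assumptions: the function names are made up, layout parts are assumed to be named like ppt/diagrams/layout1.xml, and "un-modified" is approximated by hashing the attributes of all `constr` elements of a layout together with its `uniqueId`. Whether two layouts with the same hash are really copies of a built-in definition would still need checking against the built-in ones.

```python
import hashlib
import io
import re
import sys
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

DGM_NS = "http://schemas.openxmlformats.org/drawingml/2006/diagram"
# Assumed part naming, e.g. ppt/diagrams/layout1.xml
LAYOUT_PART = re.compile(r"^(?:ppt|word|xl)/diagrams/layout\d*\.xml$")


def constraint_fingerprint(layout_xml):
    """Return (uniqueId, short hash over all constraints) for one layout part."""
    root = ET.fromstring(layout_xml)
    unique_id = root.get("uniqueId", "")
    # Keep document order of the constraints, sort only attributes within one.
    rows = [tuple(sorted(constr.attrib.items()))
            for constr in root.iter(f"{{{DGM_NS}}}constr")]
    digest = hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()[:16]
    return unique_id, digest


def scan_package(package):
    """Yield one fingerprint per diagram layout part in an OOXML package."""
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            if LAYOUT_PART.match(name):
                yield constraint_fingerprint(zf.read(name))


def scan_corpus(root_dir):
    """Count fingerprints over all pptx/docx/xlsx files below root_dir."""
    counts = Counter()
    for path in Path(root_dir).rglob("*"):
        if path.suffix.lower() in {".pptx", ".docx", ".xlsx"}:
            try:
                counts.update(scan_package(path))
            except (zipfile.BadZipFile, ET.ParseError):
                pass  # skip broken test documents
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    for (uid, digest), n in scan_corpus(sys.argv[1]).most_common():
        print(n, uid, digest)
```

If one `uniqueId` shows up with only a single hash across the corpus, that would support the guess that its constraints are un-modified copies; several hashes for the same `uniqueId` would point at modified or versioned definitions.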

Good stuff,

Michael.

I think it would not help in the current state. Armin and I want to talk and discuss what to do next. He’ll get in touch with me once he’s back at work.

I was mistaken; Armin will be away for a few more weeks; meanwhile -
perhaps Quikee can help with the code pointers & specifics?

Sorry!

Michael

I’m sorry to hear that. I wish him a speedy recovery.

Unfortunately it is not simply about code pointers; it needs decisions about the final implementation and a roadmap for how to achieve it step by step.

There was initial work in that area by Grzegorz Araminowicz in a GSoC project in 2017, and later Miklos Vajna did a lot of work on the import of SmartArt. However, if they were interested, they would have replied here.

I’m currently collecting all the problems I have noticed so far. It will take some time. I’ll notify you when that is finished. We can discuss how to proceed then.

If Quikee is interested, he could mail me and I can share the document. It is currently on the TDF Nextcloud.

Hi @Regina

There is a COOL Technical Committee meeting every Wednesday, and this topic might be a good fit for discussion there. Quikee and Miklos are usually present as well, so it could be a good opportunity to get everyone together and discuss possible directions and next steps.

Would you be able to join tomorrow?

Every Wednesday
~16:00 CEST
https://meet.jit.si/COOL-TC-Meeting

More details: https://www.collaboraoffice.org/post/communicate/#cool-technical-committee-meeting

Hi Regina,

I can help with code pointers, but I have not actively followed what Armin is doing, what the current state of the feature is, or what the roadmap to get it done is. I’m sure there are others who at least have a general overview of what was done. I can look at the document when you are done and then we can decide how to proceed.

I’ll look into getting you access to our servers so you can write the document there.

Tomaž

That has been successful. I have moved the documents. The document with the collection of problems is in ProblemsInSmartArtImplementation.odt - Nextcloud

And there is an example presentation made with PowerPoint that I mention in the document: Basic_Process.pptx - Nextcloud