View Issue Details

ID: 0004779
Project: ardour
Category: bugs
View Status: public
Last Update: 2012-07-13 10:37
Reporter: deva
Assigned To:
Priority: normal
Severity: major
Reproducibility: always
Status: new
Resolution: open
Product Version: 3.0-beta3
Target Version: 3.0
Summary: 0004779: Latin1 character in imported filename corrupts session file.
Description: I imported a file with a Latin1 character (0xf8) in its name. The session worked just fine after the import, but when I tried to reopen it an XML parser error prevented me from doing so.

XML parser error: Input is not proper UTF8, indicate encoding !
Bytes: 0xF8 0x76 0x65 0x72
XML parser error: Opening and ending tag mismatch: Regions line 120 and Region
XML parser error: Opening and ending tag mismatch: Session line 2 and Regions
XML parser error: Extra content at the end of the document

I opened the session file in an editor, replaced the Latin1 character with an ASCII character, renamed the audio file to match, and the problem went away.

A possible solution is to apply some sort of encoding to non-UTF8 filenames (e.g. base64), or simply to report the filename as illegal to the user and propose a rename before import.
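
As a rough illustration of the encoding idea, the Python sketch below turns the raw filename bytes into an ASCII-only token that is safe to store in a UTF-8 XML file, and then recovers the exact on-disk bytes from it. Percent-encoding is used here in place of the base64 mentioned above, only because it keeps plain ASCII names readable. The function names are hypothetical and this is not how Ardour stores filenames.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def encode_filename(raw: bytes) -> str:
    """Return an ASCII-only token for a filename of unknown encoding."""
    # Every byte outside the unreserved ASCII set is written as %XX.
    return quote_from_bytes(raw, safe="")


def decode_filename(token: str) -> bytes:
    """Recover the exact bytes needed to open the file on disk."""
    return unquote_to_bytes(token)


# Latin1-encoded name: 0xF8 is 'ø' in ISO 8859-1 but is not valid UTF-8.
raw_name = b"sl\xf8ver220.flac"
token = encode_filename(raw_name)
restored = decode_filename(token)
```

The round trip never interprets the bytes as text, so it works whatever encoding the filesystem happens to use.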
Tags: No tags attached.

Activities

deva

2012-03-18 09:00

reporter   ~0012962

While looking into this issue, it should also be investigated whether XML special characters (', ", &, ;, < and >) in filenames could corrupt the session file.
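
A quick way to probe this kind of problem is a round-trip test: write a name containing those characters through an XML library and read it back. The Python sketch below does that with the standard library. The `Region` element and `name` attribute are assumptions made for the illustration, not a description of the real session file layout.

```python
import xml.etree.ElementTree as ET


def region_xml(name: str) -> bytes:
    """Serialize a hypothetical <Region name="..."/> element as UTF-8 XML."""
    elem = ET.Element("Region")
    elem.set("name", name)
    # ElementTree escapes markup characters in attribute values on output.
    return ET.tostring(elem, encoding="utf-8")


def read_name(xml_bytes: bytes) -> str:
    """Parse the XML back and return the stored name."""
    return ET.fromstring(xml_bytes).get("name")


tricky = "a'b\"c&d;e<f>.wav"
xml_bytes = region_xml(tricky)
round_tripped = read_name(xml_bytes)
```

If the writer escapes attribute values, the name survives unchanged. A writer that pastes the raw string into the file would instead produce a document the parser rejects.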

paul

2012-07-11 20:30

administrator   ~0013862

what was the Latin1 character in this case ?

deva

2012-07-12 05:50

reporter   ~0013863

It was the Danish 'ø' (oslash) as I recall.

paul

2012-07-12 11:25

administrator   ~0013864

I imported a file named: Lä Vôz Dél Ríø.wav into a session with no problems at all.

So, is this still an issue?

deva

2012-07-12 12:02

reporter   ~0013867

I cannot import the file since it states "invalid encoding" in the import dialog.
My best guess is that you succeeded in importing the file because you are running your filesystem in UTF-8 mode?

With the import dialog filtering out the invalid filenames, thereby preventing the subsequent session file corruption, I would say that this issue is solved.

However, it would be worth telling the user how to "fix" the badly encoded filenames, since most users are not as adept with encodings as I am ;)

A good 'resolution text' could be along the lines of "Rename the file to not contain any special characters (æ, ä, í, é, etc.) to make import possible."

paul

2012-07-12 14:18

administrator   ~0013878

what encoding is the filesystem actually using? what type of filesystem is it?

deva

2012-07-12 14:29

reporter   ~0013879

The filesystem on my system is an ext3 filesystem with filenames encoded in Latin1 (ISO 8859-1), but most modern Linux distros probably use UTF-8.

paul

2012-07-12 14:55

administrator   ~0013880

i'm confused. it sounds as though this used to be a problem but that at this point, the file browser in the import dialog filters out invalid names, and thus "hides" the issue. is that a correct understanding?

deva

2012-07-12 15:37

reporter   ~0013881

The files still appear, but they are not selectable and they are marked with the text "invalid encoding".
So I gather that the import dialog is not directly a part of Ardour but rather an extension of the fileselector component in GTK?
In that case the filename(s) returned by the import dialog should be checked for non-utf8 characters before they are inserted into the project.
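
The check suggested here could look roughly like the following Python sketch, which treats filenames as raw bytes and separates those that are valid UTF-8 from those that are not. The function names are hypothetical; Ardour itself is not written in Python, so this only illustrates the logic.

```python
from __future__ import annotations


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def split_importable(paths: list[bytes]) -> tuple[list[bytes], list[bytes]]:
    """Split paths into (safe to store in UTF-8 XML, needs attention)."""
    ok = [p for p in paths if is_valid_utf8(p)]
    rejected = [p for p in paths if not is_valid_utf8(p)]
    return ok, rejected


candidates = [
    "sløver220.flac".encode("utf-8"),    # UTF-8 encoded name
    "sløver220.flac".encode("latin-1"),  # same name, Latin1 encoded
]
ok, rejected = split_importable(candidates)
```

The rejected list is where a message such as the proposed 'resolution text' could be shown to the user.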

paul

2012-07-12 15:42

administrator   ~0013882

could i see a screenshot of this ?

2012-07-12 16:17

 

import-dlg.png (27,610 bytes)   

deva

2012-07-12 16:17

reporter   ~0013883

The actual filename in the screenshot is "sløver220.flac"

paul

2012-07-12 18:19

administrator   ~0013884

some gtk developers suggested this doc:

  http://developer.gnome.org/glib/2.33/glib-running.html

and specifically the environment variables

  G_FILENAME_ENCODING

and

  G_BROKEN_FILENAMES

deva

2012-07-12 20:47

reporter   ~0013886

I have now run Ardour with G_BROKEN_FILENAMES=1 and can import the latin1 file without problems.
Saving the project also works, but I still get the error as described in the summary:
----
liblrdf: error - - XML parser error: Input is not proper UTF-8, indicate encoding !
Bytes: 0xF8 0x76 0x65 0x72
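
The four bytes in the error are consistent with part of the filename from the screenshot. A small Python check, offered only as an illustration and not as anything Ardour or liblrdf does, shows that they are rejected as UTF-8 but decode cleanly as Latin1:

```python
raw = bytes([0xF8, 0x76, 0x65, 0x72])  # the bytes quoted in the parser error

try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    # 0xF8 cannot start a valid UTF-8 sequence.
    valid_utf8 = False

# Latin1 maps every byte to exactly one character, so this always succeeds.
as_latin1 = raw.decode("latin-1")
```

Here `as_latin1` is the fragment "øver", as in "sløver220.flac".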

paul

2012-07-13 01:57

administrator   ~0013887

can you attach the session file (*.ardour) ?

2012-07-13 10:34

 

dims.ardour (18,487 bytes)

deva

2012-07-13 10:37

reporter   ~0013888

The problem is in lines 42 and 43. The filenames are Latin1-encoded even though the XML declaration (line 1) states that the file is UTF-8.
The problem cannot be fixed by encoding the filenames as UTF-8, since that would require renaming the Latin1-encoded file to its UTF-8 equivalent.

Issue History

Date Modified Username Field Change
2012-03-17 15:46 deva New Issue
2012-03-17 23:33 cth103 cost => 0.00
2012-03-17 23:33 cth103 Target Version => 3.0 beta4
2012-03-18 09:00 deva Note Added: 0012962
2012-05-23 15:08 cth103 Target Version 3.0 beta4 => 3.0
2012-07-11 20:30 paul Note Added: 0013862
2012-07-12 05:50 deva Note Added: 0013863
2012-07-12 11:25 paul Note Added: 0013864
2012-07-12 12:02 deva Note Added: 0013867
2012-07-12 14:18 paul Note Added: 0013878
2012-07-12 14:29 deva Note Added: 0013879
2012-07-12 14:55 paul Note Added: 0013880
2012-07-12 15:37 deva Note Added: 0013881
2012-07-12 15:42 paul Note Added: 0013882
2012-07-12 16:17 deva File Added: import-dlg.png
2012-07-12 16:17 deva Note Added: 0013883
2012-07-12 18:19 paul Note Added: 0013884
2012-07-12 20:47 deva Note Added: 0013886
2012-07-13 01:57 paul Note Added: 0013887
2012-07-13 10:34 deva File Added: dims.ardour
2012-07-13 10:37 deva Note Added: 0013888