UTF8 BOM mark crashes gb.form.editor

sergioabreu
Regular
Posts: 117
Joined: Tue Jul 09, 2024 9:27 am

Post by sergioabreu »

Hello

UTF-8 encoding allows a BOM (byte order mark) to be added as the first 3 bytes of a UTF-8 file.

These bytes are \xEF\xBB\xBF, and they act as a "signature" that the document is formally UTF-8.

Although the BOM is optional in UTF-8, some documents have it, so I think it should be handled by Mr Benoit, because a document that contains it will crash gb.form.editor. The crash message says it can't render the "image". I am not sure, but it seems that Gambas tries to convert the BOM into a visible UTF-8 symbol.

So the solution is to SKIP these 3 bytes if they are present at the beginning of a document.

Code: Select all

'Suppose you got data from the file... in a variable called data:
If Left(data, 3) = "\xEF\xBB\xBF" Then ' "=" is an exact comparison; "==" ignores case
  data = Mid(data, 4) ' Skips the BOM, making the data safe for Gambas
EndIf
'From here the data will be "clean"
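
The same check can be wrapped around the file read itself. What follows is only a minimal sketch, not how gb.form.editor does it: the function name LoadWithoutBom and the variable names are made up for illustration, and it assumes the whole file fits comfortably in memory. File.Load, Left and Mid are the standard Gambas calls.

```gambas
' Hypothetical helper: LoadWithoutBom is an illustrative name, not an existing Gambas function.
Public Function LoadWithoutBom(sPath As String) As String

  Dim sData As String

  ' File.Load returns the whole file contents as a string of raw bytes
  sData = File.Load(sPath)

  ' Left() and Mid() count bytes, so the 3-byte BOM can be tested and skipped directly
  If Left(sData, 3) = "\xEF\xBB\xBF" Then sData = Mid(sData, 4)

  Return sData

End
```

The returned string could then be assigned to the editor, for example to the Text property of a TextEditor control, instead of letting the control load the file directly.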
This post is kind of a hidden bug report. I am NOT a critic of Gambas at all, quite the opposite: I am an enthusiast and want to help make Gambas better and better.

Regards.

Sergio Abreu - Brazil
SiteDir
Site Director
Posts: 15
Joined: Sun Apr 06, 2025 11:00 pm

Re: UTF8 BOM mark crashes gb.form.editor

Post by SiteDir »

sergioabreu wrote: Wed May 07, 2025 11:11 am
So the solution is to SKIP these 3 bytes if they are present at the beginning of a document.

Thanks for the heads up! That information might help out someone.

If you think it's a bug, then you need to report it on the Gambas bug tracker. Reporting there will be the only way it can get fixed.
The Gambas One Administrative Team

- Gambas One is not the Gambas bug tracker.
- We can help you determine whether it is an IDE bug or a coding error. If it is a bug, you are the one to report it.

To report bugs in the Gambas IDE:
Official Gambas Bug Tracker
BruceSteers
Legend
Posts: 2145
Joined: Thu Jul 23, 2020 5:20 pm
Location: Isle of Wight

Re: UTF8 BOM mark crashes gb.form.editor

Post by BruceSteers »

TextEditor has now been made to handle the useless UTF-8 BOM
https://gitlab.com/gambas/gambas/-/comm ... 7757f8744d

Benoit said this...
Benoit Minisini wrote: I don't get an error with a file starting with UTF-8 BOM, just an invisible character at the beginning of the first line.

Note that BOM is a Windows thing created by moronic developers that did not understand UTF-8. BOM is useless in UTF-8, as there is no byte order in UTF-8.