[MPlayer-dev-eng] ctype.h functions on win32

Gianluigi Tiesi mplayer at netfarm.it
Sat Oct 13 00:51:02 CEST 2007


On Fri, Oct 12, 2007 at 05:33:13PM -0400, Rich Felker wrote:
> On Fri, Oct 12, 2007 at 09:57:59PM +0200, Gianluigi Tiesi wrote:
> > On Fri, Oct 12, 2007 at 01:03:03PM -0400, Rich Felker wrote:
> > > On Thu, Oct 11, 2007 at 07:03:23AM +0200, Gianluigi Tiesi wrote:
> > > > Win32 crt's ctype functions like isalpha isspace etc
> > > > are declared as int, but they return wrong results
> > > > (using msvc 2005, you get an assert or crash)
> > > > if the value is < 0
> > > 
> > > It's unclear from the spec what they're supposed to do. On the one
> > > hand it says the application shall ensure that the value passed is a
> > > valid value for unsigned char. But on the other hand, it says the
> > > functions shall return a nonzero value if the argument is a character
> > > of the corresponding type, and 0 otherwise (which could be interpreted
> > > as meaning that 0 should be returned for invalid values as well).
> > > 
> > > In any case, why are we using these functions? They're bogus since
> > > they only work for 8bit characters and not multibyte characters, and
> > > MPlayer does not even use the locale anyway due to various issues, so
> > > we might as well just code the ascii tests directly if ascii is all we
> > > need...
> > 
> > the proto of .h is int, so it should really work also with negative
> > values.
> 
> And the prototype for strlen is char *, so it should work with any
> possible value a char * can take on??? No, of course not! The
> documentation for the function specifies what values are valid.
> 

yes at least linux man pages says this but non ascii chars
as signed char are negative values while the functions
needs an unsigned char OR -1

> > there are strange results with different locales, ie if LANG
> > env var is turkish islower and/or tolwer works in a different way
> > 
> > suggestions?
> 
> MPlayer does not use locale aside from determining the character
> encoding. BTW this hilights a rather serious bug in that MPlayer's use
> of translations is invalid. The translated strings passed to printf
> are generally not valid multibyte character strings in the current
> locale, since the locale remains C. Thus, printf should not
> necessarily accept them.
> 
> IMO we should use LC_CTYPE (but not any other locale classes) and just
> remove all use of case-mapping and character-class functions and
> replace them with ascii-only macros.

from the man:
These functions check whether c, which must have the value of an
unsigned char or EOF, falls into a certain character class according to
the current locale.

"according to the current locale"
this is not so good, as currently used in mplayer

> > > > By looking at the code, there are some functions
> > > > where it's enough to change char to unsigned char in arguments
> > > > 
> > > > other functions are more difficult since they use int
> > > > since some api are still returning int like fgetc (deprecated iirc)
> > > 
> > > Huh?? Why would fgetc be deprecated? Returning int is correct for
> > > fgetc. It returns a byte (unsigned char) or EOF (-1) if end of file is
> > > reached or another error occurs. What type would you suggest it should
> > > return to meet these needs?
> > 
> > so I don't remember correct  :) the problem still exists with the value
> > so it may need to be checked if -1 then casted to unsigned in isspace
> > call
> 
> -1 (EOF) is also valid for passing to isspace and friends, iirc..
> 
> > > > in these places the value can be casted to uint32_t
> > > 
> > > Absolutely not.
> > 
> > I mean in the isspace call not elsewhere
> 
> That would be even worse. -1 is (IIRC) a valid argument to isspace but
> 2^32 - 1 definitely is not valid!!

hmm uint8_t is better :D

Mainly problems are in subtitle stuff, I wonder if subreader.c
it's also used when using ass.
If yes I think it will need a rewrite or partial rewrite.
it's all based on isXXX functions, that imho should be avoided

and please don't turn each email of this ml into a flame
or a dispute about api calls meaning
I'm trying to help and I'm open to your prospectives

Regards

-- 
Gianluigi Tiesi <sherpya at netfarm.it>
EDP Project Leader
Netfarm S.r.l. - http://www.netfarm.it/
Free Software: http://oss.netfarm.it/



More information about the MPlayer-dev-eng mailing list