Hello everyone
I have been working on a project and ran into a problem: a string won't trim properly. As you can see in the pictures, I want to get only the numbers from the string. The first picture is from the startup of the program (where trimming works fine), and the last picture is from the place where I actually need to get the numbers.
https://i.sstatic.net/4J5OM.jpg (Can't post pictures because I don't have 10 rep)
And in code
HtmlElementCollection TD = b[i].GetElementsByTagName("td");
string FirstString = TD[1].InnerText; //which is "??(?131?|?26?)?? "
Console.WriteLine("2. FirstString: " + FirstString);
string[] SecondString = FirstString.Trim('?', ')', '(', ' ').Split('|');
Console.WriteLine("SecondString1 " + SecondString[0].Trim('?'));
Console.WriteLine("SecondString2 " + SecondString[1].Trim('?'));
And below is the warning that I get in Visual Studio:
CropFinder.exe (CLR v4.0.30319: CropFinder.exe): Loaded 'C:\Windows\assembly\GAC\Microsoft.mshtml\7.0.3300.0__b03f5f7f11d50a3a\Microsoft.mshtml.dll'. Module was built without symbols.
Thank you for your help in advance, Erik
The characters you are receiving from HTML are very likely not actually ? characters, but characters that cannot be displayed properly in console output, so ? is displayed instead.
To see exactly what characters you are actually receiving, so that you are able to modify your code accordingly, enumerate over them and output their codes:
foreach ( char character in FirstString )
{
Console.WriteLine( (int)character ); // cast to int, not byte: a byte cast would truncate codes above 255
}
If you compare the output for your custom string with the output for the string from HTML, you will probably see that the character codes differ. You can then trim based on the code:
FirstString.Trim( ( char )characterCode );
Where characterCode is the character code from the output.
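To make the two steps above concrete, here is a minimal sketch that lists the code of every character and then trims by code. The class and method names (CodePointDemo, DescribeChars, TrimByCodes) are made up for illustration, and U+200E (the left-to-right mark) is only an assumption standing in for whatever invisible characters your HTML actually contains; use the codes from your own output instead.

```csharp
using System;
using System.Linq;

public static class CodePointDemo
{
    // Formats each UTF-16 code unit of the input as U+XXXX.
    public static string[] DescribeChars(string text)
    {
        return text.Select(c => "U+" + ((int)c).ToString("X4")).ToArray();
    }

    // Trims the given character codes from both ends of the string.
    public static string TrimByCodes(string text, params int[] codes)
    {
        char[] trimChars = codes.Select(code => (char)code).ToArray();
        return text.Trim(trimChars);
    }

    public static void Main()
    {
        // Assumption: U+200E stands in for the characters shown as '?'.
        string sample = "\u200E(\u200E131\u200E|\u200E26\u200E)\u200E ";

        // Step 1: inspect the real character codes.
        Console.WriteLine(string.Join(" ", DescribeChars(sample)));

        // Step 2: trim using the codes found in step 1.
        string trimmed = TrimByCodes(sample, 0x200E, '(', ')', ' ');
        string[] parts = trimmed.Split('|');
        Console.WriteLine(TrimByCodes(parts[0], 0x200E));
        Console.WriteLine(TrimByCodes(parts[1], 0x200E));
    }
}
```

Note that Trim only removes characters from the two ends of a string, which is why each part is trimmed again after the Split.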
As an alternative solution to the trimming, you should consider extracting the number from the result:
static void Main(string[] args)
{
string FirstString = "??(?131?|?26?)??";
var parts = FirstString.Split('|');
Console.WriteLine(ExtractNumber(parts[0]));
Console.WriteLine(ExtractNumber(parts[1]));
Console.ReadLine(); // keep the console window open
}
private static int ExtractNumber(string text)
{
var numberString = String.Join("", text.Where(Char.IsNumber));
int result = 0;
int.TryParse(numberString, out result);
return result;
}
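I am using the Where LINQ extension method (which requires using System.Linq) to select only the numeric characters from the input. Then I am using String.Join to turn the resulting sequence of characters back into a string (with an empty string as the separator). Finally, I am calling int.TryParse to attempt to convert the resulting string to an int. If the conversion fails, for example because the text contains no digits, the method returns 0.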
If you are using C# 7, you can simplify the code a bit more:
private static int ExtractNumber(string text)
{
var numberString = String.Join("", text.Where(Char.IsNumber));
int.TryParse(numberString, out var result);
return result;
}
out variables can be declared inline in C# 7.
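As a variation on the ExtractNumber idea above, the sketch below uses a regular expression instead of Where and Join, so it pulls every run of digits out of the whole string without splitting on | first. This is a different technique from the one in the answer, and the names NumberExtractor and ExtractNumbers are hypothetical. It matches only the ASCII digits 0 to 9 and is written without inline out variables so that it also compiles before C# 7.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Returns every run of ASCII digits in the input as an int, in order.
    public static List<int> ExtractNumbers(string text)
    {
        var numbers = new List<int>();
        foreach (Match match in Regex.Matches(text, "[0-9]+"))
        {
            int value;
            // Digit runs too long to fit in an int are skipped.
            if (int.TryParse(match.Value, NumberStyles.None,
                             CultureInfo.InvariantCulture, out value))
            {
                numbers.Add(value);
            }
        }
        return numbers;
    }

    public static void Main()
    {
        // Same sample text as in the question, with literal '?' characters.
        List<int> numbers = ExtractNumbers("??(?131?|?26?)?? ");
        Console.WriteLine(string.Join(", ", numbers));
    }
}
```

Because the pattern ignores everything that is not a digit, it does not matter which characters surround the numbers, so you do not need to know their codes.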