Homework question

Posted: Thursday 26th January 2023 4:51pm
by twbro54l
I'm trying to build a horoscope app for school that reads its data from a URL. The question is: how can I read the contents of a web page and get only the data I want? The string I want is the content value in <meta property="og:description" content="xxxxxxxxxxx">.
I'm not interested in being spoon-fed, but I really need some help!
Thanks :P

Re: Homework question

Posted: Thursday 26th January 2023 5:43pm
by cogier
Welcome to the forum.

Option 1 would be to sign up for an API; that way you could easily get just the data you want. Have a look here
Option 2: As far as I can see, you can only download the whole page, not just part of it. However, once you have the page, you can use Gambas to find the part you want.

Here is some code that will download the whole page of a website and save it in your program's folder. You can then search the text for the position of the part you want and grab what you need.

Does that help? No spoon-feeding intended! :D

' NEEDS gb.net.curl

Public Sub Form_Open()

  Dim sPage As String

  sPage = GetHScope("https://www.elle.com/horoscopes/daily/a98/taurus-daily-horoscope/")

  File.Save(Application.Path &/ "hScope.txt", sPage)

End

Public Function GetHScope(sURL As String) As String

  Dim hClient As HttpClient                       ' The HTTP client
  Dim sResult As String                           ' To store the downloaded page

  hClient = New HttpClient As "hClient"           ' Create an HTTP client
  With hClient                                    ' With the client..
    .URL = sURL                                   ' Set up the URL
    .Async = False                                ' Synchronous: Get() blocks until the request completes
    .TimeOut = 60                                 ' Don't hang around waiting for more than 60 seconds
    .Get()                                        ' Get the data
  End With

  If Lof(hClient) Then sResult = Read #hClient, Lof(hClient)   ' If any data was received, store all of it in sResult
  Return sResult

End

Re: Homework question

Posted: Thursday 26th January 2023 10:55pm
by twbro54l
It does help! Do you mind if I ask how I can get the string I want? :D

Re: Homework question

Posted: Thursday 26th January 2023 11:34pm
by thatbruce
It's returned by that function, ...

in Main()
sPage = ...

Re: Homework question

Posted: Friday 27th January 2023 8:23am
by twbro54l
Yes, but it also returns "garbage" that I don't need.

Re: Homework question

Posted: Friday 27th January 2023 8:26am
by cogier
twbro54l wrote: Thursday 26th January 2023 10:55pm It does help! Do you mind if I ask you how can I get the string I want? : :D
If you mean how to get the correct part out of the saved data then let me know which site you are wanting to use.

Re: Homework question

Posted: Friday 27th January 2023 8:47am
by twbro54l
Here's the site! Sorry!
https://www.horoscop.ro/horoscop-berbec/
I can get them in English too, but I will be reading the data based on the date the user wants, for example yesterday, one week ago or one month ago.
Here's what I could get from the page source, where the text is located.

<div class="zodie-content-texts black">
									<p>Dimineata promite a fii linistita. Interactioneaza cu cei care iti plac si nu incerca sa iei noi initiative. Ofertele minore se pot dovedi de succes. O serie de sarcini materiale actuale pot fi realizate cu succes la fel de bine.</p>
                </div>
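
One way to pull the text out of a block like this is to find the opening and closing markers with InStr and cut out what lies between them with Mid. The following is only a minimal sketch: TextBetween is a hypothetical helper name (not part of Gambas), sPage is assumed to hold the downloaded page, and it assumes the markers appear in the page exactly as written in the snippet above.

```gambas
' Hypothetical helper: returns the text found between sStart and sEnd,
' or an empty string if either marker is missing.
Public Function TextBetween(sText As String, sStart As String, sEnd As String) As String

  Dim iFrom As Integer
  Dim iTo As Integer

  iFrom = InStr(sText, sStart)          ' Position of the opening marker (0 if absent)
  If iFrom = 0 Then Return ""
  iFrom += Len(sStart)                  ' Move past the opening marker

  iTo = InStr(sText, sEnd, iFrom)       ' First closing marker after that point
  If iTo = 0 Then Return ""

  Return Trim(Mid(sText, iFrom, iTo - iFrom))

End

' Example use, assuming sPage holds the downloaded page:
'   sDiv = TextBetween(sPage, "<div class=\"zodie-content-texts black\">", "</div>")
'   Print TextBetween(sDiv, "<p>", "</p>")
```

Searching for the div first and then for the paragraph inside it keeps the second search from matching some other <p> earlier in the page. If the site changes its markup, the markers will need adjusting.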

Re: Homework question

Posted: Friday 27th January 2023 10:49am
by BruceSteers
twbro54l wrote: Thursday 26th January 2023 4:51pm Im trying to build a horoscope app for school which reads the data from an URL. The question is: How can I read contents of a URL page and get only the data I want? Like the string I want is after "<meta property="og:description" content=xxxxxxxxxxx"" .
I'm not interested in being spoon feed but I really need some help!
Thanks :P
Please explain more precisely what you need, and what your experience with Gambas is.

You say you do not want to be spoon-fed, but it also sounds like you are very new to Gambas and have no idea where to begin.
If that's the case, then don't worry about the spoon-feeding ;)

You will want to use string functions/operators. Have you read the wiki?
http://gambaswiki.org/wiki/cat/string
http://gambaswiki.org/wiki/cat/stringop

Dim aLines As String[]
Dim iCount As Integer

aLines = Split(sPage, "\n")  ' Split the page into lines.

For iCount = 0 To aLines.Max

  ' Use the Like keyword to check a string pattern
  If aLines[iCount] Like "<meta property=\"og:description\" content=*" Then
    Print "Found Line";; aLines[iCount]
    ' Only print the next line if there is one, to avoid an out-of-bounds error
    If iCount < aLines.Max Then Print "The next line is "; aLines[iCount + 1]
    Break   ' Exit the loop as we are finished
  Endif

Next



If the meta line you're seeking is in the header you may be able to use the Headers array http://gambaswiki.org/wiki/comp/gb.net. ... nt/headers
Edit: forget that. I tried, and the meta properties cannot be found in the headers.

Or you could modify cogier's method to search as you read the page. You can then stop reading as soon as the line is found, which is normally within the first few lines, instead of scanning the whole page.
(See the next message)

Re: Homework question

Posted: Friday 27th January 2023 11:13am
by BruceSteers
This will extract the contents of the meta tag you are seeking.


Public Sub Form_Open()

  Dim hClient As HttpClient                       ' The HTTP client
  Dim sLine As String                             ' The line currently being read
  Dim sResult As String                           ' The extracted description (stays empty if not found)

  hClient = New HttpClient As "hClient"           ' Create an HTTP client
  With hClient                                    ' With the client..
    .URL = "https://www.horoscop.ro/rac/"         ' Set up the URL
    .Async = False                                ' Synchronous: Get() blocks until the request completes
    .TimeOut = 60                                 ' Don't hang around waiting for more than 60 seconds
    .Get()                                        ' Get the data
  End With

  While Lof(hClient)                              ' Loop while there is still unread data

    sLine = hClient.ReadLine()                    ' Read one single line

    If sLine Like "<meta property=\"og:description\" content=*" Then   ' Here the line is found
      sResult = Mid(sLine, InStr(sLine, "content=") + 9)               ' Trim off the left side
      sResult = Mid(sResult, 1, RInStr(sResult, "\"") - 1)             ' Trim off the right side
      Break                                       ' Exit the loop
    Endif

  Wend

  hClient.Close                                   ' Close the stream

  Print sResult
 
End



Prints...
Un castig financiar ar putea veni ca rezultat al unei miscari pe care nimeni nu se astepta sa o faci. Este posibil ca toata lumea sa fie foarte mandra de tine - si la fel vei fi si tu. Acesta este doa

Re: Homework question

Posted: Friday 27th January 2023 1:29pm
by twbro54l
Thank you so much! Yes, I'm new to Gambas. I'm going to add the birth date, month and current date (or the date they want the data for) to a Form and print the result in a text area.
I'll post updates when I'm done. Thank you guys so much!
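
For the "yesterday, one week ago or one month ago" part mentioned earlier in the thread, Gambas has a built-in DateAdd function that can do the date arithmetic. This is just a rough sketch of computing the dates; how the site expects a date to be passed (if it accepts one at all) is not covered in this thread, so the "yyyy-mm-dd" format is only an assumption for display.

```gambas
Public Sub Main()

  Dim dToday As Date = Now              ' Current date and time

  ' DateAdd(date, period, count): a negative count goes back in time
  Print "Yesterday:     "; Format(DateAdd(dToday, gb.Day, -1), "yyyy-mm-dd")
  Print "One week ago:  "; Format(DateAdd(dToday, gb.Week, -1), "yyyy-mm-dd")
  Print "One month ago: "; Format(DateAdd(dToday, gb.Month, -1), "yyyy-mm-dd")

End
```

In a Form, the same calls could take the date chosen by the user in place of Now.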