Duplicate entries when using pagination

Hi, I think the issue I just posted is likely related. I am putting it here so both can be solved at the same time.

Yeah, looks like the same issue; I just saw your thread recently. We're working on fixing it :raised_hands:


It's been almost 2 months. Any update on this? You'd think a company focused on data would prioritize such a glaring issue as a broken pagination system. You can imagine how many people are getting duplicates or inaccurate data and have no idea why, and how many other projects are at a standstill until this issue is addressed.

We are working on a new backend infrastructure now.

Did you try to use the cursor?

Yeah, I tried the "cursor" that you guys put in after taking away the "sort by" option, and it didn't work. The system for pulling data via the API is fundamentally broken, and it doesn't seem to be much of a priority for you guys. As a data service provider, I would think this would get much more attention. Virtually every project built on your API service will work incorrectly or give false or missing data. Are you aware of how this will impact new devs jumping into your ecosystem? There's no disclaimer, no email notice, nothing. Plenty of Moralis emails and YouTube videos on using the JavaScript calls, though. Could this be because your model relies on monetizing devs per call? I mean, an API call would allow people to actually retrieve, store, and manipulate data on their own servers, so I could see how this is directly opposed to your pay model. Sorry, I guess I'm just really at a loss for how such a gaping flaw in your system just sits in place, frustrating people who don't spend the 3 weeks (as I did) writing tests (remotely) to pinpoint how your system is failing.

Please forward this to your highest dev or manager. I feel like your responses are just stopgap replies. "...did you try the cursor?" Seriously?

Is this only an issue with this endpoint or contract? I haven't noticed any duplicates in my queries, but I'm not working with anywhere near the same amount of data.

Start at the top of this thread

I forwarded this problem to the team, along with that particular contract with ~100k token IDs.

I tested with the cursor now and it seems to work fine:

import requests
import time

# Unique token IDs seen across all pages; a set makes duplicate detection trivial.
ids = set()

def get_nft_owners(offset, cursor):
    print("offset", offset)
    url = ('https://deep-index.moralis.io/api/v2/nft/'
           '0x50f5474724e0ee42d9a4e711ccfb275809fd6d4a'
           '?chain=eth&format=decimal')
    if cursor:
        url += "&cursor=%s" % cursor

    print("api_url", url)
    headers = {
        "Content-Type": "application/json",
        "X-API-Key": "API_KEY",
    }
    response = requests.get(url, headers=headers)
    data = response.json()
    try:
        print("nr results", len(data['result']))
        for x in data['result']:
            ids.add(int(x['token_id']))
    except (KeyError, TypeError):
        # No 'result' list in the payload (e.g. an error response): dump it and stop.
        print(repr(data))
        print("exiting")
        raise SystemExit

    print(data['page'], data['total'])
    return data['cursor']


# Walk all pages: 211 pages of 500 results covers the ~100k token IDs.
cursor = None
for j in range(0, 211):
    print("nr unique token_ids at offset", j * 500, "=>", len(ids))
    cursor = get_nft_owners(j * 500, cursor)
    print()
    time.sleep(1.1)  # stay under the rate limit


print("nr unique token_ids", len(ids))

There still seems to be an issue with pagination (or general data queries). I wrote a script to call the "{address}/nft/transfers" endpoint, store the results, store the cursor, and repeat until my local DB has the same "total" as the Moralis endpoint shows. This works (most of the time), but as I call the endpoint (hourly) to look for new entries, they don't show in the results.

When I initially ran the script it showed a final total of 333050 entries. I stored the cursor and set up the hourly cron. Within a day, the remote total was higher than my local total, but the API results are empty using the last given cursor. See below. You can test this here using the cursor and endpoint below:

https://deep-index.moralis.io/api-docs/#/account/getNFTTransfers

The last cursor received:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJvcmRlciI6IkRFU0MiLCJvZmZzZXQiOjMzMzUwMCwibGltaXQiOjUwMCwidG9rZW5fYWRkcmVzcyI6IjB4NTBmNTQ3NDcyNGUwZWU0MmQ5YTRlNzExY2NmYjI3NTgwOWZkNmQ0YSIsInBhZ2UiOjY2Nywid2hlcmUiOnt9LCJrZXkiOiI5MDQ4OTM2LjEzNS4xMzUuMCIsImlhdCI6MTY1MTE2NDQxOX0.oJ4qlZ4vejZPF9gfuT58Z1vjH1jy_jfGhCjm_apn0fg

The Contract Address:
0x50f5474724e0ee42d9a4e711ccfb275809fd6d4a

array:6 [
"total" => 334981
"page" => 667
"page_size" => 500
"cursor" => ""
"result" => []
"block_exists" => true
]

You can see that it says page 667. We know that the max result set is 500, so 667 x 500 = 333500, but the total shown above is 334981.

Something appears to be wrong (still) with the Moralis API cursor system.
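
For reference, here is a minimal sketch of the hourly check that produces the empty page above (assuming the contract-transfers endpoint, since the cursor embeds that token_address; API_KEY is a placeholder):

import requests

URL = ("https://deep-index.moralis.io/api/v2/nft/"
       "0x50f5474724e0ee42d9a4e711ccfb275809fd6d4a/transfers"
       "?chain=eth&format=decimal")
HEADERS = {"X-API-Key": "API_KEY"}

# Resume from the cursor saved at the end of the previous full run.
saved_cursor = "eyJhbGciOiJIUzI1NiIs..."  # truncated; the full JWT is above

data = requests.get(URL + "&cursor=" + saved_cursor, headers=HEADERS).json()
print(data["total"], data["page"], len(data["result"]))
# Observed: "total" keeps growing past 333500, yet "result" is [] and "cursor" is ""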

You don't have to store the cursor; you will get a new cursor every time you start the query again without a cursor.

So when I get to the end and there are no more responses, I can call the endpoint again without a cursor and it will reply with any new results?

Cursor is used to get the next page of results. Omitting it will start your query from the beginning or the first page.

It won't give you new results compared to your previous queries; you are just calling it again, which may or may not have different results or data.
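
In code, the intended pattern is roughly this (a sketch against the contract-transfers endpoint; passing cursor=None always starts again from the first page):

import requests

API = "https://deep-index.moralis.io/api/v2"
HEADERS = {"X-API-Key": "API_KEY"}

def iter_pages(contract, cursor=None):
    # cursor=None starts at the first page; a stored cursor resumes mid-set.
    while True:
        params = {"chain": "eth", "format": "decimal"}
        if cursor:
            params["cursor"] = cursor
        data = requests.get("%s/nft/%s/transfers" % (API, contract),
                            headers=HEADERS, params=params).json()
        yield data["result"]
        cursor = data.get("cursor")
        if not cursor:  # an empty cursor marks the last page
            return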

Right. That's what I thought. I'm trying to get the newest entries. So, I get all the entries, an hour goes by, new entries are added (on the Moralis side), and I want to get those new entries.

You could call it again and compare the results, but this is obviously quite heavy. I think that's your only option at this stage. Do they need to specifically be new entries, or could you just replace your old data?

What chain are you using for that getNFTTransfers example you mentioned (to look at the total/page_size issue)?

Yes, that is how you should use the cursor, but it will not reply with only the new results; it will iterate again over all the results.
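
In other words, the only client-side "delta" available today is to re-walk everything and keep just the entries you have not seen before. A sketch, reusing iter_pages from above and assuming each transfer carries transaction_hash and log_index fields:

seen = set()  # in practice, persisted between runs in your local DB
new_entries = []

for page in iter_pages("0x50f5474724e0ee42d9a4e711ccfb275809fd6d4a"):
    for t in page:
        key = (t["transaction_hash"], t.get("log_index"))
        if key not in seen:
            seen.add(key)
            new_entries.append(t)

print("entries not seen in earlier runs:", len(new_entries))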

I see. This isn't going to scale if every client needs to pull the entire data set in order to get the new entries. A delta mechanism could be implemented that would still allow the Moralis servers to reduce load. I created a diagram (a rough sketch of the idea follows below); please take a look. I'm open to discussing this with your systems architect(s). Please pass this along.
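
Purely to illustrate the diagram (this is the proposal, not an existing feature): the client remembers the highest block number it has ingested, and each pull asks only for entries above that watermark. The from_block filter below is an assumed parameter for such a delta endpoint:

import requests

API = "https://deep-index.moralis.io/api/v2"
HEADERS = {"X-API-Key": "API_KEY"}
CONTRACT = "0x50f5474724e0ee42d9a4e711ccfb275809fd6d4a"

def fetch_new_since(contract, last_block):
    # Hypothetical delta pull: only transfers above a block watermark.
    params = {"chain": "eth", "format": "decimal",
              "from_block": last_block + 1}  # assumed/proposed filter
    data = requests.get("%s/nft/%s/transfers" % (API, contract),
                        headers=HEADERS, params=params).json()
    return data["result"]

# Each hourly run advances the watermark instead of re-reading 300k+ rows.
last_block = 14670000  # example watermark from the previous run
for t in fetch_new_since(CONTRACT, last_block):
    last_block = max(last_block, int(t["block_number"]))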

Can you post this on roadmap.moralis.io?

Yeah, I posted it there.
Honestly, I don't know how people will derive much use out of an API system that can't get just the new entries. I mean, I'm sure you guys aren't pulling the entire blockchain into your database each time you sync. Can you imagine the processing that would take? That's what you're asking your clients to do.
