Delete by query

I have a large number of objects I want to remove, but not all entries.

I can delete them one by one in a loop, but it's painfully slow (there are approximately 130,000 objects to remove).

Ideally there would be some kind of DeleteByQuery option; that would be great.

You can connect directly to MongoDB and it should be faster.

Is there a faster way to do this with the API yet?

My entire codebase revolves around using the API, rather than connecting to Mongo. I'm not even passing Mongo credentials to the app. :sob:

Something like query.lessThan('block_timestamp', myDate).delete() would be great.

Also, more detailed BulkDelete examples would be nice. Filtering on greater/less than isn't initially clear.

What does your current delete code look like?

Have you used bulkDeleteMany in a cloud function?

Well, I'm trying to do something like this:

let swapsToDelete = [{ filter: { block_timestamp: { lessThan: relevantDate.toUTCString() } } }];
return Moralis.bulkDeleteMany(swapTable, swapsToDelete);

I receive the following error: "You cannot use [object Object] as a query parameter."

It's not clear from the docs how you filter on greater than and less than. Or I'm simply missing it.

Any update here? It would really help to get some "painfully detailed" examples of advanced filters for bulk deletes. I would imagine that gt/lt are fairly commonly expected comparisons.

What is the value of swapTable? I'm not sure if you can add additional constraints in this way, but see if you can get past the error.

Looks like you can do it a different way using destroyAll; tested with about 5 rows. I'm not sure how well it will do with larger amounts, e.g. in the hundreds or thousands.

Cloud function (or run it with the SDK in your app):

Moralis.Cloud.define("deleteMany", async (request) => {
  const query = new Moralis.Query("Ages");
  query.greaterThan("age", 3);
  query.find().then(Moralis.Object.destroyAll);
});
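
For the "run it with the SDK" route, a one-line sketch of invoking the cloud function from your app (using the function name from the example above):

const result = await Moralis.Cloud.run("deleteMany");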

Hrm, I didn't realize it was OK to use that for massive deletes (hundreds of thousands and millions of records).

Moralis.Cloud.define('purgeSwapTable', async (request) => {
  let relevantDate = new Date();
  relevantDate.setHours(relevantDate.getHours() - 2);

  const swapTable = request.params.swapTable;

  const logger = Moralis.Cloud.getLogger();

  const query = new Moralis.Query(swapTable);
  query.lessThan('block_timestamp', relevantDate);

  // return the promise chain so the cloud function waits for the deletes
  return query
    .find()
    .then(Moralis.Object.destroyAll)
    .catch((error) => {
      logger.error('ERROR FINDING ITEMS: ' + error.code + ' : ' + error.message);
    });
});

This doesn't work. I'm dynamically passing the swapTable and I've verified that it passes the correct class names. I've also tried multiple Date formats.

Maybe it's an issue with the query constraint for block_timestamp. Does the query itself actually work in general (not trying to delete anything, just checking whether that lessThan constraint returns results)?

hundreds of thousands and millions of records

Yes, it possibly won't work then. destroyAll can empty a class completely with no other constraints set, so you can test it on one of these classes to see if it can empty it.
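
A minimal sketch of that test as a cloud function (the emptyClass name and the class name are placeholders):

Moralis.Cloud.define("emptyClass", async () => {
  // placeholder class name; no constraints set, so every returned object is deleted
  const query = new Moralis.Query("SomeClass");
  const results = await query.find();
  return Moralis.Object.destroyAll(results);
});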

This doesn't work. I'm dynamically passing the swapTable and I've verified that it passes the correct class names

I'm not sure what you mean; you can only use one class name at a time for each query. What is an example of the value of swapTable?

I have a similar local query (using my local Moralis instance) for finding items via block_timestamp, and it works. But nothing in the cloud code seems to work or return "querywise".

But nothing in the cloud code seems to work or return "querywise"

Are you getting a server error log from your catch? That's fine; you just need to check your server dashboard to see if anything was deleted.

Tables are currently named like so: UNISWAPvTWOETH, UNISWAPvTWOMATIC, ...

It's definitely finding them. I can output the count when I use .limit(). I can also loop through the array of results. But no "action" on the results has any . . . well . . . result. All table sizes remain the same, or continue to grow as inserts happen.

I've tried .then() off the promise using Moralis.Object with no errors.

I've tried const result = await ... and used res.destroyAll with no errors.

I'm finding anything beyond a basic "listing" cloud function frustratingly unintuitive. It would be nice to get a full spec on the Moralis.Cloud object. Other than size, it's not clear to me why calling destroyAll doesn't work, or why it's not a "proper" function call, or . . . so many other things.

I'll see if I can just batch a background local async() process since it appears local queries work.

What is an example of an event sync you're using to get this sort of large amount of data into a class? I would like to test on my end.

Moralis uses Parse Server, so you can look up Parse Server documentation/info for anything Moralis's docs don't seem to cover.

Literally using coreservices_addEventSync to watch all of BSC/ETH/AVAX/MATIC (in 4 separate tables) for Swap events. I was using Speedy Nodes for this before Moralis shut them down. Now I'm trying to use sync events with cloud functions to emulate that and simply pull results every couple of minutes. However, it's necessary to purge the data regularly or the tables become extremely delayed.

Literally using coreservices_addEventSync to watch all of BSC/ETH/AVAX/MATIC (in 4 separate tables) for Swap events

Exact examples/code with your parameters would help.

A detail I missed was setting the limit, which also applies to destroyAll:

query.limit(10000000);

Before, it was just using the default of 100.
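
Putting it together, a sketch of the earlier purgeSwapTable function with the limit applied (same parameters as above):

Moralis.Cloud.define('purgeSwapTable', async (request) => {
  const relevantDate = new Date();
  relevantDate.setHours(relevantDate.getHours() - 2);

  const query = new Moralis.Query(request.params.swapTable);
  query.lessThan('block_timestamp', relevantDate);
  query.limit(10000000); // raise the default limit of 100 so find() returns all matches

  const results = await query.find();
  return Moralis.Object.destroyAll(results);
});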

This seems to work OK on my end with a lot of results, but it may take some time before it's finished. It may make sense to have an upgraded server to handle all of these tasks.

Yeah, I set limits to something manageable but still discernible (e.g. 5000), and nothing changed. When I went to anything above 15,000 (triggering over HTTP), I simply received 500 errors.

Yeah, maybe once we hit production. But for now, I have no interest in paying $300/month to be able to delete large numbers of entries. :melting_face:

Yeah, I set limits to something manageable but still discernible (e.g. 5000), and nothing changed

10,000 deletions at a time works fine on my end; it took about 45 seconds for the dashboard to reflect this. 20,000 deletions took 75 seconds. This would just be your server maxing out its resources. You can run some tests to find the limit you can use for your server with its current load.

A smaller limit of a few thousand would be fine, which you can run repeatedly, unless you're adding tens of thousands of objects every few seconds.
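
A minimal sketch of that repeated approach, reusing the purgeSwapTable query from above (the purgeInBatches name and the 5000 batch size are illustrative):

Moralis.Cloud.define('purgeInBatches', async (request) => {
  const relevantDate = new Date();
  relevantDate.setHours(relevantDate.getHours() - 2);

  const query = new Moralis.Query(request.params.swapTable);
  query.lessThan('block_timestamp', relevantDate);
  query.limit(5000); // a manageable batch size

  // keep deleting until no matching objects remain
  let results = await query.find();
  while (results.length > 0) {
    await Moralis.Object.destroyAll(results);
    results = await query.find();
  }
});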

Otherwise, look at doing it directly through MongoDB. You just extend your API use to include your own server/backend.

Are you running that as a cloud function or on a local Moralis instance? Mind showing the code?

What's the recommended way here? Mongoose? Extend the base Moralis object? I tried Mongoose but got no results. Do I also need to whitelist IP addresses? Also, what is the default db name for connections? Is it just the dApp name? Any pointers would be helpful.

Never mind, I found the parse table. Without searching the forums (or guessing based on the Python example), this isn't immediately clear. Also had to whitelist my IP.

It would really be appreciated if whoever is responsible for the docs would provide more detail and more explicit examples. I realize this isn't practical for all cases, but when you have examples connecting to admin with no (obvious) examples of where to find the names of other tables, it makes for much :dizzy_face: :dizzy:

In summary:

  • Use a Mongo client to connect directly
  • Use the parse table
  • Whitelist your IP address
  • Make use of MongoClient.db(...).collection(...).deleteMany() [I prefer to break these up into separate objects but just used a single line to demo the "relationships"]; see the sketch below
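
A minimal sketch of that direct connection, assuming the mongodb Node driver; the connection-string placeholders and the UNISWAPvTWOETH collection name are illustrative, while 'parse' is the database name found above:

const { MongoClient } = require('mongodb');

async function purgeOldSwaps() {
  // placeholders: fill in your own connection details (and whitelist your IP)
  const client = new MongoClient('mongodb://<user>:<password>@<host>:<port>');
  try {
    await client.connect();

    const cutoff = new Date();
    cutoff.setHours(cutoff.getHours() - 2);

    // classes live as collections inside the parse database
    const result = await client
      .db('parse')
      .collection('UNISWAPvTWOETH')
      .deleteMany({ block_timestamp: { $lt: cutoff } });

    console.log('Deleted ' + result.deletedCount + ' documents');
  } finally {
    await client.close();
  }
}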

Anyway, thanks for the input.

NOTE: This entire listing is, of course, a workaround due to bulkDeleteMany Cloud functions simply not working well for me.

NOTE: .deleteMany() may time out if the result set is too large, in which case you may need to .find().limit() and loop through.
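
A sketch of that batched fallback, under the same assumptions as the connection example above (client, cutoff, and the collection name carry over):

const coll = client.db('parse').collection('UNISWAPvTWOETH');

let batch;
do {
  // fetch only the _ids of a limited batch of matching documents
  batch = await coll
    .find({ block_timestamp: { $lt: cutoff } })
    .limit(5000)
    .project({ _id: 1 })
    .toArray();

  if (batch.length > 0) {
    await coll.deleteMany({ _id: { $in: batch.map((doc) => doc._id) } });
  }
} while (batch.length > 0);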
