MySQL Remove Duplicates

Suppose we have a table called names like the one below and want to remove the rows with duplicate name values.

+----+--------+
| id | name   |
+----+--------+
| 1  | google |
| 2  | yahoo  |
| 3  | msn    |
| 4  | google |
| 5  | google |
| 6  | yahoo  |
+----+--------+

1) If you want to keep the row with the lowest id value (rows 1, 2 and 3 remain):

DELETE n1 FROM names n1, names n2
WHERE n1.id > n2.id AND n1.name = n2.name;

2) If you want to keep the row with the highest id value (rows 3, 5 and 6 remain):

DELETE n1 FROM names n1, names n2
WHERE n1.id < n2.id AND n1.name = n2.name;
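
To try the join condition without touching a real MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is an adaptation, not the MySQL statement itself: SQLite has no multi-table DELETE, so the same self-join is run as a subquery that selects the ids to delete. The dedupe function and its keep parameter are names invented for this example.

```python
import sqlite3

# Sample data copied from the table above.
ROWS = [(1, "google"), (2, "yahoo"), (3, "msn"),
        (4, "google"), (5, "google"), (6, "yahoo")]


def dedupe(keep="lowest"):
    """Return the rows left after removing duplicate names.

    keep="lowest" mirrors query 1, keep="highest" mirrors query 2.
    """
    op = ">" if keep == "lowest" else "<"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?, ?)", ROWS)
    # SQLite has no multi-table DELETE, so the same self-join runs as a
    # subquery that lists the ids to delete.
    conn.execute(
        "DELETE FROM names WHERE id IN ("
        "  SELECT n1.id FROM names n1, names n2"
        f"  WHERE n1.id {op} n2.id AND n1.name = n2.name)"
    )
    remaining = conn.execute("SELECT id, name FROM names ORDER BY id").fetchall()
    conn.close()
    return remaining


print(dedupe("lowest"))   # [(1, 'google'), (2, 'yahoo'), (3, 'msn')]
print(dedupe("highest"))  # [(3, 'msn'), (5, 'google'), (6, 'yahoo')]
```

Running the inner SELECT on its own first is a cheap way to preview which rows a DELETE like this would remove.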

ref: http://stackoverflow.com/questions/4685173/delete-all-duplicate-rows-except-for-one-in-mysql

Get Table Columns And Sizes

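This query lists the character-type columns (varchar, char, nvarchar, nchar) of a SQL Server table together with their maximum lengths. Just replace the '___TABLE___NAME___' placeholder with your table name.

-- Temporary table holding each character column and its declared length.
CREATE TABLE #temp
(
colname nvarchar(128) NULL, -- column names can be up to 128 characters
collen int NULL             -- -1 means a (max) type
)

INSERT INTO #temp (colname, collen)
SELECT column_name, character_maximum_length
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = '___TABLE___NAME___'
AND data_type IN ('varchar', 'char', 'nvarchar', 'nchar')

SELECT * FROM #temp

DROP TABLE #temp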

Apache CouchDB

Apache CouchDB™ is a database that uses JSON for documents, JavaScript for MapReduce queries, and regular HTTP for an API.

CouchDB is a database that completely embraces the web. Store your data with JSON documents. Access your documents with your web browser, via HTTP. Query, combine, and transform your documents with JavaScript. CouchDB works well with modern web and mobile apps. You can even serve web apps directly out of CouchDB. And you can distribute your data, or your apps, efficiently using CouchDB’s incremental replication. CouchDB supports master-master setups with automatic conflict detection.
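
As a rough illustration of "JSON documents over HTTP", the sketch below only builds the requests with Python's standard library and does not send them. The base URL assumes a local server on CouchDB's default port 5984, and the database name sites, the document id google and both helper functions are hypothetical names chosen for this example.

```python
import json
import urllib.request

# Assumption: a CouchDB server on its default port; nothing is sent here.
BASE_URL = "http://localhost:5984"


def put_document(db, doc_id, doc):
    """Build the PUT request that stores `doc` as /{db}/{doc_id}."""
    return urllib.request.Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_document(db, doc_id):
    """Build the GET request that reads /{db}/{doc_id} back as JSON."""
    return urllib.request.Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


req = put_document("sites", "google", {"name": "google"})
print(req.get_method(), req.full_url)  # PUT http://localhost:5984/sites/google
print(req.data.decode("utf-8"))        # {"name": "google"}
```

With a server running and the database already created, either request could be passed to urllib.request.urlopen to actually perform the call.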

CouchDB comes with a suite of features, such as on-the-fly document transformation and real-time change notifications, that make web app development a breeze. It even comes with an easy-to-use web administration console. You guessed it, served up directly out of CouchDB! We care a lot about distributed scaling. CouchDB is highly available and partition tolerant, but is also eventually consistent. And we care a lot about your data. CouchDB has a fault-tolerant storage engine that puts the safety of your data first.

See the introduction, technical overview, or one of the guides for more information.

MongoDB

MongoDB (from “humongous”) is a scalable, high-performance, open source NoSQL database. Written in C++, MongoDB features:

Document-oriented storage » JSON-style documents with dynamic schemas offer simplicity and power.

Full Index Support » Index on any attribute, just like you’re used to.

Replication & High Availability » Mirror across LANs and WANs for scale and peace of mind.

Auto-Sharding » Scale horizontally without compromising functionality.

Querying » Rich, document-based queries.

Fast In-Place Updates » Atomic modifiers for contention-free performance.

Map/Reduce » Flexible aggregation and data processing.

GridFS » Store files of any size without complicating your stack.

Commercial Support » Enterprise class support, training, and consulting available.
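
To make the Map/Reduce item in the list above more concrete, here is a minimal sketch of the idea in plain Python. It is not MongoDB's API: in MongoDB the map and reduce functions are written in JavaScript and run on the server. The sample documents and all function names here are hypothetical.

```python
from collections import defaultdict

# Hypothetical JSON-style documents; the field names are illustrative only.
DOCS = [
    {"_id": 1, "name": "google"},
    {"_id": 2, "name": "yahoo"},
    {"_id": 3, "name": "msn"},
    {"_id": 4, "name": "google"},
]


def map_doc(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["name"], 1


def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs):
    """Group the mapped pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            groups[key].append(value)
    return {key: reduce_values(key, values) for key, values in groups.items()}


print(map_reduce(DOCS))  # {'google': 2, 'yahoo': 1, 'msn': 1}
```

Here the job counts how many documents share each name, which is the same question the duplicate-removal queries earlier on this page answer in SQL.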

JustOneDB

The Relational Database for Big Data

JustOneDB is a new class of database – a NewSQL database that feels like a traditional relational database yet performs and adapts to change like no other.

The likelihood is that your application is best suited to a relational database – most are. But with exploding data volumes driving spiraling hardware and software license costs, the options for keeping pace with the tsunami of data are daunting.

JustOneDB removes all of that pain. It can handle the biggest data volumes today but at a fraction of the cost and complexity of alternative solutions.

Fast Facts
Fully-functional relational database
– SQL99 compliant
– Fully transactional
– PostgreSQL compatible
– Industry standard interfaces for BI tools and languages
– No need to design indexes or partitions for query performance
– No need for schema transformations
– Very fast row inserts
– Index-like query performance
– Concurrent updates and queries
– Can use DAS, NAS and SAN storage
– Supports stored procedures

Performance
Performance per SATA HDD and 3GHz CPU
– Insert up to 500,000 column values per second
– Eliminate up to 1 billion rows per second for selective queries

Capacity
– Up to 65535 tables per database
– Up to 1024 columns per table
– Up to 65535 bytes per text value
– Number values +/- 10^75 at up to 512 bit precision

Limitations
Release 1.1 has the following temporary limitations that will be removed in future releases.
– Analytical queries currently use conventional join strategies and row aggregation and perform similarly to a fully indexed row store

– Features not currently supported:
  – Triggers
  – Save-points
  – Unique and key constraints
  – Text search
  – Spatial search
  – Object extensions

RavenDB

RavenDB is a transactional, open-source Document Database written in .NET, offering a flexible data model designed to address requirements coming from real-world systems.

RavenDB allows you to build high-performance, low-latency applications quickly and efficiently.

Features:

– Safe by default

Based on years of experience with real, live enterprise systems, RavenDB is built to ensure data access is done right. No locking, no abuse of network or system resources. With RavenDB your application is guaranteed to be as fast and reliable as it gets.

– Transactional

ACID transactions are fully supported, even between different nodes. If you put data in, it is going to stay there. We care about your data and we keep it safe.

– Scalable

Sharding, replication and multi-tenancy are supported out-of-the-box. Scaling out is as easy as it gets.

– Schema free

Forget about tables, rows, mappings or complex data-layers. RavenDB is a document-oriented database you can just dump all your objects into.

– Get running in 5 minutes

5 minutes, that’s all it takes to start using RavenDB. Designed not to get in your way, RavenDB requires no complex installation process; just download and run. Check out our Quickstart Tutorials.

It Just Works

Stop fighting the database and get ready to go into a world full of fun, with a database that cares. The fluent and intuitive API makes building data-backed applications a breeze. As a guideline, the server requires zero administration. Just unzip, run and start writing code.

Fast queries

RavenDB can satisfy any query at the speed of light, as no processing whatsoever is done to satisfy queries. All indexing operations are done in the background, and have no effect on querying, writing or reading from the database.

Best practices built in

Enjoy working with the bleeding edge of modern software development, using friction-free methodologies.

High performance

RavenDB is a very fast persistence layer for every type of data model. Skip creating complicated mapping or multi-layer DALs, just persist your entities. It Just Works, and it does the Right Thing.

Caching built in

Multiple levels of caching operate automatically and transparently, both on the server and on the client, by default. Yet caching is completely configurable, and advanced modes like Aggressive Caching exist.

APIs

Access RavenDB from any language and technology. Client / Server communication is done via REST over HTTP, and client APIs are available for .NET (including LINQ and F# support), Silverlight and JavaScript.

Built-in management studio

Easily manage your database and data using the graphical UI bundled with every instance of RavenDB server.

Carefully designed

Every bit of code was carefully considered. RavenDB was designed with best-practices in mind, and it ensures that everything Just Works.

Map / Reduce

Indexes are defined using easy-to-write Map/Reduce functions written in LINQ syntax. With support for concepts like multi-maps and boosting, indexes are simple to write yet very powerful.

Feature rich and extensible

Built with extensibility in mind, RavenDB can be easily extended both on the client and the server. Many integration points ensure you can always squeeze more out of RavenDB. You aren’t shackled to a One Size Fits None solution.

Embeddable

RavenDB can be embedded in any .NET application, making it a perfect fit for desktop applications as well.

Bundles

RavenDB ships with server-side plugins extending it in various helpful ways. It is just a matter of dropping a DLL to the server folder.

Index replication to SQL

To allow you to take advantage of the reporting tools available in the relational world, RavenDB allows you to easily replicate indexes to SQL tables.

Full-text search built in

No need to plug in any external tool to support advanced searches on text fields. Full-text searches are supported out of the box by the server and the client API.

Advanced search techniques

The built-in full-text search engine (Lucene) allows RavenDB to support a lot of other cool stuff, including but not limited to:

Geo-spatial search support

Out of the box, with an easy-to-use API.

Easy backups

Make backups asynchronously, without disturbing the normal DB operations. Backup and Restore are both supported by the DB, and a utility tool that makes the process even easier is bundled with the server.

Multi-tenancy

Host multiple databases in one RavenDB server.

Attachments

RavenDB supports storing data streams that are not documents, such as images and other binary data that you don’t want to store as a document but still want available.

Online Index Rebuilds

Indexes are updated in the background, without requiring any interaction from the user or interrupting the normal ACID operation of the database.

Fully async (C# 5 ready)

RavenDB already supports the brand new async API introduced by C# 5.

Community

RavenDB enjoys a great and supportive community you can meet in the mailing list and on JabbR.

Cloud hosting available

No need to host the server yourself. Run RavenDB in the cloud with RavenHQ, CloudBird, AppHarbor or Windows Azure.